Federal Reserve Bank of Chicago

A Nonparametric Analysis of Black-White Differences in Intergenerational
Income Mobility in the United States
Debopam Bhattacharya and
Bhashkar Mazumder

REVISED
March 2011
WP 2007-12

Debopam Bhattacharya∗
University of Oxford

Bhashkar Mazumder
Federal Reserve Bank of Chicago

March 2011

Abstract
Lower intergenerational income mobility for blacks is a likely cause behind the
persistent inter-racial gap in economic status in the US. However, few studies have
analyzed black-white differences in intergenerational income mobility and the factors that determine these differences. This is largely due to the absence of appropriate methodological tools. We develop nonparametric methods to estimate the
effects of covariates on two measures of mobility. We first consider the traditional
transition probability of movement across income quantiles. We then introduce a
new measure of upward mobility which is the probability that an adult child’s relative position exceeds that of the parents. Conducting statistical inference on these
mobility measures and the effects of covariates on them requires nontrivial modifications of standard nonparametric regression theory since the dependent variables
are nonsmooth functions of marginal quantiles or relative ranks. Using NLSY data,
we document that blacks experience much less upward mobility across generations
than whites. Applying our new methodological tools, we find that most of this gap
can be accounted for by differences in cognitive skills during adolescence.
Keywords: intergenerational mobility, upward mobility, nonparametric regression, Hadamard differentiability, black-white mobility gap; JEL codes: C14, D31
∗ Correspondence address: Department of Economics, Oxford University, Manor Road Building, Manor Road, OX1 3UQ, UK; phone number 44-7503-858289; Fax: +44 1865 271094, email address: debobhatta@gmail.com

1 Introduction

A topic of long-standing interest among social scientists is the persistent disadvantage
in economic status faced by blacks in the US many generations after the end of slavery
and several decades after the elimination of state-sanctioned segregation. The fact that
the racial gap is so highly persistent suggests that blacks in the US may have low rates
of upward intergenerational economic mobility. While there is a vast literature on the
black-white earnings gap and a growing number of studies on intergenerational mobility
(IGM) in the US, very few studies have examined differences in the rates of IGM between
racial groups in the US and the underlying sources behind those differences. Such an
analysis could potentially provide insight into why there is such persistence in racial
inequality and whether this may reflect greater inequality of opportunity for blacks. A
better understanding of the factors behind differences in IGM could also inform policies
intended to address racial gaps such as early life interventions or affirmative action policies.
This paper studies racial differences in relative income mobility over generations, using
US data from the National Longitudinal Survey of Youth (NLSY) which contains large
intergenerational samples of both blacks and whites.
The dearth of studies on racial differences in IGM is due, in part, to the fact that most
recent research in this area has focused on using one particular measure, the intergenerational elasticity (IGE) which is simply the regression coefficient obtained by regressing
(log) child’s permanent income on (log) parents’ permanent income.1 The IGE provides a
measure of income persistence, and 1 minus the IGE is widely used as a measure of relative mobility. However, the IGE cannot be used to compare mobility differences between
population subgroups with respect to the entire distribution. For example, the IGE for
blacks only describes the rate at which earnings among black children regress to the black
mean, not the mean of the entire distribution.
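As a minimal sketch of the IGE described above, the following computes the slope from regressing log child income on log parent income. The income figures are hypothetical and are constructed so that the true elasticity is 0.4; the function names are illustrative and not taken from the paper.

```python
import math

def ols_slope(x, y):
    """Slope coefficient from a least-squares regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def intergenerational_elasticity(parent_income, child_income):
    """IGE: slope from regressing log child income on log parent income."""
    log_parent = [math.log(v) for v in parent_income]
    log_child = [math.log(v) for v in child_income]
    return ols_slope(log_parent, log_child)

# Hypothetical data (assumption): log(child) = 1 + 0.4 * log(parent) exactly.
parent = [10_000.0, 20_000.0, 40_000.0, 80_000.0]
child = [math.exp(1.0 + 0.4 * math.log(p)) for p in parent]

ige = intergenerational_elasticity(parent, child)
relative_mobility = 1.0 - ige  # "1 minus the IGE" as a mobility measure
print(round(ige, 6), round(relative_mobility, 6))
```

Running the same regression on a single subgroup would describe regression toward that subgroup's own mean, which is the limitation noted above.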
An alternative approach is to calculate transition probabilities to describe the rates of
movement across specific quantiles of the distribution over a generation. Since transition
probabilities can measure the movements of blacks across the income distribution of the
entire population comprising both blacks and whites, one can make meaningful statements
concerning racial differences in mobility.

1 The intergenerational correlation (IGC), used by many researchers, is qualitatively similar and the two measures are equivalent when the variance in income is unchanging across generations.

However, a difficulty arises with transition probabilities if one wants to estimate rates of IGM conditional on (continuous) covariates like test scores. For the IGE, it is straightforward to measure effects of covariates: one simply needs to include them along with
their interactions with parents’ income as additional regressors and the statistical theory is straightforward. In contrast, a formal statistical method for using covariates in
transition matrices is lacking. The development of such a methodology would allow one
to investigate the underlying mechanisms behind black-white differences in IGM. For example, it is often hypothesized that inadequate parental investment in children’s human
capital could lead to reduced mobility. Therefore it is natural to consider the association
between IGM and measures of human capital such as education and test scores. Previous
research using the NLSY has shown that cognitive skills during adolescence as measured
by percentile scores on the Armed Forces Qualification Test (AFQT) can account for black-white gaps in levels of educational attainment and adult wages (Cameron and Heckman,
2001; Neal and Johnson, 1996); so it would be useful to examine if this result extends to
measures of changes in relative position across generations.
In order to address the above void in the literature, we develop in this paper a nonparametric statistical methodology for analyzing conditional transition probabilities. The
relevant inference theory for marginal transition probabilities was previously developed in
Formby, Smith and Zheng (2004). When the relevant covariates are discrete, one can simply apply their results within each covariate category to conduct inference on conditional
transitions. But with continuously distributed covariates, the parameters of interest are
infinite-dimensional and thus nonparametric smoothing methods are warranted.23 Using
a standard parametric model for conditional mobility (e.g., a probit) is problematic here
because it is unclear what type of joint distribution of errors will imply a probit form
for transition probabilities; in particular, a bivariate normal error distribution will not.
Furthermore, a probit will be shown below to produce misleading qualitative conclusions
in our empirical analysis.

2 The AFQT percentile scores take on values from 1 to 99. Given our sample size, analysis by racial group within each percentile of AFQT will be very imprecise. Furthermore, small differences in AFQT percentiles, unlike differences in race, are unlikely to imply big changes in functional relationships. Therefore, treating AFQT score as a continuously distributed covariate will be the natural and correct approach.
3 Trede (1998) has shown how continuous covariates can be included in analyzing a class of inequality reducing functionals that have been used to measure intra-generational mobility but not for measures based on transition probabilities.

While transition probabilities can be effectively used to compare relative mobility
across subgroups, their substantive drawback is their overly disaggregate nature, i.e., there
are an infinite number of transition probabilities depending on which quantile is chosen to
be the threshold for the sons, and a summary measure of mobility across relative income
positions would be useful for consolidating the information provided in transition matrices. Therefore, we introduce a new alternative measure of upward mobility. Specifically,
we condition on families where the father’s income is at or below a particular percentile
(say, the median) and then estimate the probability that the income rank of a son randomly picked from such a family exceeds his parents’ rank in the prior generation. It is a
single easily interpretable summary measure and its value does not depend on arbitrary
discretization of income distributions. Further, as explained below, it is more robust than
the transition probability to heterogeneity of income within the reference group of families
that we started with. We conduct statistical inference on both the level of this mobility
measure and the effects of covariates on that level, using NLSY data. Establishing the
relevant distribution theory requires some methodological innovation because IGM measures involve outcome variables that contain non-smooth functions of initially estimated
rank-type functionals. Standard nonparametric regression theory is therefore inadequate
for this purpose.
Our first set of empirical results documents the large racial disparity in transition
probabilities out of the bottom of the income distribution. For example, we find that
blacks are 26 percent less likely to move out of the bottom quartile than are whites. We
then show that black-white differences in mobility are much smaller when based on our
measure of upward mobility. This is because (i) our measure captures the fact that many
blacks exhibit small upward movements in relative ranks which are ignored by transition
probabilities and (ii) because black income levels below any given threshold tend to be
smaller than white income levels, blacks need much larger absolute gains to surpass a
common percentile threshold, leading them to have lower transition probability.
Our most striking finding, however, is that when we (nonparametrically) control for
AFQT scores, most of the racial gap in IGM disappears, whether measured via transition probability or upward mobility. This finding suggests that policy makers should
potentially place particular emphasis on interventions earlier in life that may determine
cognitive ability differences by adolescence rather than strategies targeting adult labor
markets. Indeed, growing evidence shows that black-white differences in test scores can be strongly affected by environmental influences (e.g., Neal and Johnson, 1996; Hansen,
Heckman, and Mullen 2004; Chay, Guryan, and Mazumder 2009) and do not simply reflect innate differences. Our results also extend the current literature which has focused
on levels of economic status, by showing that test scores can also potentially explain racial
differences in upward movements in economic status over generations.
We offer a few caveats to our study. Like most of the existing IGM literature, our
analysis is essentially descriptive in nature and one should be cautious in attaching causal
interpretations to our findings. Additionally, all measures used in the present paper are based
on relative positions and do not pertain to movements out of absolute thresholds such as
from under the poverty line.
The rest of the paper proceeds as follows: section 2 presents a review of some related
papers on black-white differences in IGM; section 3 describes and discusses the parameters
of interest; section 4 states the asymptotic distribution theory required for conducting
statistical inference on these measures; section 5 describes the NLSY data; section 6
presents the empirical results and section 7 concludes. All technical proofs as well as a
technical lemma (lemma 1) on Hadamard differentiability are collected in the appendix.
Throughout the paper the symbol ":=" will denote equality by definition.

2 Related Literature

Research on intergenerational mobility (IGM) has been primarily motivated by the notion
of equality of opportunity. Societies in which economic status in one generation is largely
determined by economic status in the prior generation are thought to be less fair and
provide less opportunity. Measures of mobility such as the intergenerational elasticity
and transition probabilities are designed to capture the degree of persistence in economic
status.4 Black and Devereux (2010) and Solon (1999) provide extensive surveys of research
on IGM.

4 The interpretation of mobility measures as indices of "equality of opportunity" has been critiqued by several authors (e.g., Van de Gaer et al., 2001; Roemer, 2004; Swift, 2005; Jencks and Tach, 2006).

There is also an extensive literature on black-white differences in economic status. Fryer (2010) provides a review of this literature. These studies have typically had little to say about IGM, in part because of the lack of appropriate measures for studying group differences but also because most intergenerational samples of blacks are too small for inference. A few studies have addressed the racial dimension of the intergenerational
transmission of status but did not estimate the rate of income mobility. For example, Datcher (1981), using the PSID, regressed adult outcomes on family background characteristics separately by race and sex, and found that black families are not as successful as
white families in translating parental economic gains into offspring achievement. Corcoran and Adams (1997) also use the PSID and compare how the probability of escaping
poverty differs by race and sex and how these estimates are affected by adding covariates.
They find that black children living in poverty are substantially more likely to be poor as
adults and that structural economic factors during childhood (e.g. local unemployment
rates) play a key role.
Hertz (2005) was the first to use transition probabilities to directly measure intergenerational income mobility by race. Using the PSID, Hertz estimated very large racial
differentials in the probability of leaving the bottom quartile with blacks substantially less
upwardly mobile. Using probit models, Hertz found that these racial differences could
not be explained by parental income or education. However, the use of a probit is not
econometrically well-grounded since no natural joint distribution (such as a bivariate
normal or log-normal) of underlying income variables implies a probit form for transition
probabilities.5 Incidentally, in his study, Hertz provides some motivation for our construction of a summary measure of upward mobility. He points out that with transition
matrices "the problem ... is that there is no best way to summarize their contents".6

3 Parameters of interest

We first describe the parameters of interest based on transition probabilities and then
those related to our new measure of upward mobility.

5 Given the relatively smaller samples in the PSID and possible concerns about the quality of the data on blacks, it would be useful to produce similar estimates using the NLSY. Although Hertz makes maximum use of the PSID by including the oversample of poorer households, there is some concern about the validity of the sample due to issues related to availability of the initial sampling frame (Lee and Solon, 2009). In addition, there has been substantial attrition of black families in the PSID.
6 Hertz (2008) develops an estimator that takes into account both the within- and between-group effects and finds that blacks have less mobility. However, it is not clear how to interpret the scale of the measure, whether it has desirable properties, and whether it can be used for producing conditional estimates.

3.1 Conditional transition probabilities

Let $F_0(\cdot)$ and $F_1(\cdot)$ denote the c.d.f. of the overall income distribution for fathers and sons, respectively. Then, the transition probability measures the probability that a son is at or above the $s$th quantile of $F_1$, conditional on his father being between the $s_1$th and $s_2$th quantiles of $F_0(\cdot)$, i.e.,

$$\theta(s,(s_1,s_2)) = \frac{\Pr\left[F_1(Y_1) \ge s,\ s_1 \le F_0(Y_0) \le s_2\right]}{\Pr\left[s_1 \le F_0(Y_0) \le s_2\right]}. \tag{1}$$

Notice that $\theta(s,(s_1,s_2))$ can be decomposed by levels of discrete and continuous covariates $X$ such as age and education of the father and/or the son as

$$\theta(s,(s_1,s_2)) = \int \Pr\left[F_1(Y_1) \ge s \mid s_1 \le F_0(Y_0) \le s_2,\ X = x\right] dF(x \mid s_1 \le F_0(Y_0) \le s_2) := \int \theta_c(x; s,(s_1,s_2))\, dF(x \mid s_1 \le F_0(Y_0) \le s_2),$$

where

$$\theta_c(x; s,(s_1,s_2)) = \Pr\left[F_1(Y_1) \ge s \mid s_1 \le F_0(Y_0) \le s_2,\ X = x\right], \tag{2}$$

is labeled the "conditional transition probability". Notice that $F_1$ and $F_0$ in (2) still refer to unconditional income distributions in the two generations. To interpret (2) directly, consider a numerical example. Suppose $X$ denotes whether a father is black, $s = 0.5$, $s_1 = 0.4$, $s_2 = 0.6$. Suppose there are a total of 100 father-son pairs within the $(0.4, 0.6)$ quantiles of the entire income distribution of the fathers’ generation. Of these, suppose 40 are black and 60 are white. Suppose 10 of the 40 black sons earn above the $s$th quantile (in the sons’ generation) and 30 of the 60 white sons do the same. Then, the overall transition probability is $\theta(s,(s_1,s_2)) = (10+30)/100 = 0.4$. The conditional transition probability among blacks is $\theta_c(\text{black}; 0.5, (0.4, 0.6)) = 10/40 = 0.25$ and that among whites is $\theta_c(\text{white}; 0.5, (0.4, 0.6)) = 30/60 = 0.5$. The white minus black difference is 0.25, suggesting higher mobility among whites.
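The arithmetic of this example, and the decomposition of the overall transition probability into group shares times conditional probabilities, can be checked with a short sketch. The counts are the hypothetical ones from the text; the variable names are illustrative.

```python
# Father-son pairs with the father in the (0.4, 0.6) quantile band,
# using the hypothetical counts from the numerical example.
counts = {
    "black": {"pairs": 40, "sons_above": 10},
    "white": {"pairs": 60, "sons_above": 30},
}

total_pairs = sum(group["pairs"] for group in counts.values())

# Overall transition probability: all sons above the threshold over all pairs.
overall = sum(group["sons_above"] for group in counts.values()) / total_pairs

# Conditional transition probability within each racial group.
conditional = {race: g["sons_above"] / g["pairs"] for race, g in counts.items()}

# Group shares within the reference band play the role of the weights in (2).
shares = {race: g["pairs"] / total_pairs for race, g in counts.items()}
recombined = sum(conditional[race] * shares[race] for race in counts)

gap = conditional["white"] - conditional["black"]
print(overall, conditional, recombined, gap)
```

Weighting the conditional probabilities by the group shares recovers the overall probability, which is the discrete-covariate version of the integral in (2).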
One can analogously define transition probability conditional on more covariates, e.g., race and father’s college attainment, and use it to measure the black-white difference in transition at each value of father’s education:

$$\tilde{\theta}_w(x; s,(s_1,s_2)) - \tilde{\theta}_b(x; s,(s_1,s_2)),$$

where

$$\tilde{\theta}_b(x; s,(s_1,s_2)) = \Pr\left[F_1(Y_1) \ge s \mid s_1 \le F_0(Y_0) \le s_2,\ X = x,\ \text{black} = 1\right],$$

$$\tilde{\theta}_w(x; s,(s_1,s_2)) = \Pr\left[F_1(Y_1) \ge s \mid s_1 \le F_0(Y_0) \le s_2,\ X = x,\ \text{black} = 0\right]. \tag{3}$$

To interpret (3), refer to the previous numerical example and let $X = 1$ denote that the father went to college. Now suppose that among the 40 black fathers, 10 went to college and 30 did not. Of the 10 black sons whose fathers went to college, suppose 6 surpassed the $s$th quantile, and suppose that of the 30 black sons whose fathers did not attend college, only 4 surpassed the $s$th quantile. Similarly, suppose that among the 60 white fathers, 50 went to college, of whom 25 sons surpassed the $s$th quantile, and among the 10 white fathers who did not attend college, 5 sons surpassed the $s$th quantile. Then $\tilde{\theta}_w(1; 0.5, (0.4, 0.6)) = 25/50 = 0.5$ and $\tilde{\theta}_b(1; 0.5, (0.4, 0.6)) = 6/10 = 0.6$, and the difference is $-0.1$. These hypothetical numbers would suggest that although blacks in the $(0.4, 0.6)$ class are less mobile than whites on average, the college educated blacks among them are in fact more mobile than college educated whites. In section 6 we will explore racial differences in conditional transition probabilities nonparametrically using actual data on sons’ education and AFQT percentile scores.
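A small sketch of the stratified example above shows how the white minus black gap can change sign once father's college attendance is held fixed. The cell counts are the hypothetical ones from the text; the helper `rate` is illustrative.

```python
# (race, father's education) -> (number of fathers, sons above the s-th quantile)
cells = {
    ("black", "college"): (10, 6),
    ("black", "no_college"): (30, 4),
    ("white", "college"): (50, 25),
    ("white", "no_college"): (10, 5),
}

def rate(race, education=None):
    """Transition probability for a race, optionally within one education cell."""
    selected = [
        value for (r, e), value in cells.items()
        if r == race and (education is None or e == education)
    ]
    fathers = sum(n for n, _ in selected)
    movers = sum(k for _, k in selected)
    return movers / fathers

gap_overall = rate("white") - rate("black")
gap_college = rate("white", "college") - rate("black", "college")
gap_no_college = rate("white", "no_college") - rate("black", "no_college")
print(gap_overall, gap_college, gap_no_college)
```

The aggregate gap of 0.25 coexists with a gap of -0.1 among the college cell because black fathers in this example are concentrated in the no-college cell, where mobility is lowest.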

3.2 Upward mobility

We formally introduce our new measure of upward mobility in this section. We first present the analytic expressions and then discuss the substantive features which make our measure both intuitively appealing and analytically different from measures based on transition probabilities.

Our direct measure of upward mobility is simply the probability that the son’s percentile rank in the overall income distribution of his generation exceeds that of his parents in the income distribution of the parents’ generation by a fixed amount. Let $Y_0$, $Y_1$ denote father’s and son’s income with respective marginal c.d.f.’s $F_0$ and $F_1$. Then for fixed $0 < s_1 < s_2 < 1$, we define upward mobility for families between the $s_1$th and $s_2$th quantiles by an extent $\tau \in [0, 1 - s_2]$ as

$$\upsilon(\tau, s_1, s_2) = \Pr\left(F_1(Y_1) - F_0(Y_0) > \tau \mid s_1 \le F_0(Y_0) \le s_2\right). \tag{4}$$
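A sample analog of the upward mobility measure in (4), alongside the transition probability, can be sketched as follows. This is only an illustration under stated assumptions: ranks are computed from the empirical c.d.f. (share of the sample at or below each value), the ten income pairs are hypothetical, and the function names are not from the paper.

```python
import bisect

def empirical_ranks(values):
    """Empirical c.d.f. at each observation: share of the sample at or below it."""
    ordered = sorted(values)
    n = len(values)
    return [bisect.bisect_right(ordered, v) / n for v in values]

def _reference_group(father_income, son_income, s1, s2):
    """Rank pairs (father, son) for families with s1 <= father's rank <= s2."""
    pairs = zip(empirical_ranks(father_income), empirical_ranks(son_income))
    group = [(r0, r1) for r0, r1 in pairs if s1 <= r0 <= s2]
    if not group:
        raise ValueError("no families in the reference group")
    return group

def upward_mobility(father_income, son_income, tau, s1, s2):
    """Share of the reference group whose son's rank exceeds the father's by more than tau."""
    group = _reference_group(father_income, son_income, s1, s2)
    return sum(1 for r0, r1 in group if r1 - r0 > tau) / len(group)

def transition_probability(father_income, son_income, s, s1, s2):
    """Share of the reference group whose son is at or above the s-th quantile."""
    group = _reference_group(father_income, son_income, s1, s2)
    return sum(1 for _, r1 in group if r1 >= s) / len(group)

# Hypothetical incomes for ten father-son pairs (illustrative only).
fathers = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
sons = [15, 12, 55, 35, 95, 60, 20, 80, 70, 100]

upsilon = upward_mobility(fathers, sons, tau=0.0, s1=0.0, s2=0.5)
theta = transition_probability(fathers, sons, s=0.5, s1=0.0, s2=0.5)
print(upsilon, theta)
```

In this toy sample the upward mobility measure (0.6) exceeds the transition probability (0.4) because one son improves on his father's rank without reaching the median.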

One can alternatively define upward mobility by conditioning on $F_0(Y_0) = s$ rather than $s_1 \le F_0(Y_0) \le s_2$, and thus avoid aggregation bias due to income heterogeneity within the interval $(s_1, s_2)$. However, conditioning on $F_0(Y_0) = s$ would require additional smoothing since $F_0(Y_0)$ would be continuously distributed. By making the length of the interval $(s_1, s_2)$ small, we would both avoid this smoothing and yet remove some of the aggregation bias. Furthermore, defining the reference group to be $s_1 \le F_0(Y_0) \le s_2$ is also consistent with how transition probabilities have been traditionally defined in the literature and thus enables direct comparison and contrast between the two measures.7
In analogy with (2) above, one may introduce covariates $X$ into the analysis and define conditional upward mobility at $X = x$ as

$$\upsilon_c(x; \tau, s_1, s_2) = \Pr\left(F_1(Y_1) - F_0(Y_0) > \tau \mid s_1 \le F_0(Y_0) \le s_2,\ X = x\right). \tag{5}$$

The idea is that we start with all families where the father was between the $s_1$th and $s_2$th percentiles. This ensures that all the corresponding sons have equal "space to move up". With these families constituting our population, we then evaluate the extent of upward mobility for different groups defined by values of $X$. Below, we derive the statistical distribution theory for estimates of $\upsilon(\tau, s_1, s_2)$ and $\upsilon_c(x; \tau, s_1, s_2)$. In section 6 we will contrast overall upward mobility among blacks versus whites and then analyze how controlling for relevant covariates affects this difference.
Contrasting upward mobility with transition probability: A key feature of $\upsilon(\tau, s_1, s_2)$ is that it counts the sons’ small upward movements in relative positions from their fathers’ which are ignored by transition probabilities. Comparing upward mobility

$$\upsilon(\tau, s_1, s_2) = \Pr\left(F_1(Y_1) > F_0(Y_0) + \tau \mid s_1 \le F_0(Y_0) \le s_2\right)$$

and the transition probability

$$\theta(s_2, (s_1, s_2)) = \Pr\left[F_1(Y_1) > s_2 \mid s_1 \le F_0(Y_0) \le s_2\right],$$

one can see that unlike transition probability, $\upsilon(\tau, s_1, s_2)$ is counting those sons whose ranks exceeded their fathers’ by $\tau$ but did not necessarily exceed $s_2$. The resulting magnitude of difference between the two measures, however, is an empirical question and depends on the joint distribution of $(Y_0, Y_1)$. Below, we graph these differences for the lognormal case.

7 Conditioning on $F_0(Y_0) = s$ will require averaging over a subjectively determined bandwidth around $s$, and thus entertain a certain amount of income heterogeneity in finite samples. In addition, the inference theory will become far more complicated. So we do not pursue this option here.

An advantage of our upward mobility measure can be readily seen when comparing mobility between, say, whites versus blacks. Suppose we condition on fathers with incomes at or below the median income in the fathers’ generation, i.e., $s_1 = 0$ and $s_2 = 0.5$. Suppose this interval is (0, \$40). Then it is reasonable to expect that black incomes are concentrated in the lower end of (0, \$40) and white incomes in the upper end (see figure 5 below, which compares the c.d.f.s of white and black parental income in the bottom quintile). Now, the transition probability counts only those sons who exceed the median income for the sons’ generation so that black sons have to make a much larger income gain than white sons to be counted as having made progress. Our upward mobility measure corrects this by requiring all sons to have advanced by (at least) the same amount $\tau$ w.r.t. their fathers’ percentile rank in order for them to be counted as having progressed.
The extent of movement $\tau$ controls how much we want to include small movements in relative position. With $\tau = 0$, every positive movement is counted, however small. As $\tau$ rises, we count only larger movements in relative position but keep this extent of movement the same for all subgroups of the population. This implies, however, that we are applying different absolute thresholds to sons of different subgroups, e.g., black sons will be counted as having moved up even if their rank has not crossed an absolute threshold as long as they have made sufficient progress relative to their own fathers. In contrast, the transition probability keeps the absolute threshold constant, and hence counts wealthier subgroups with smaller extent of movements but ignores some poorer subgroups with larger extent of movement from their parents’ rank.
As we will show in section 6, black-white differences in mobility (c.f., figure 4) are, in fact, quite different depending on which measure is used. Specifically, we find that whites appear to be much more upwardly mobile relative to blacks when measured by the transition probability of moving out of a given quantile. The white-black difference in mobility, however, is much smaller when measured in terms of our upward mobility index. The key reason for this is that many black sons make relatively small upward movements which are missed by $\theta(s_2, (s_1, s_2))$ but are captured by $\upsilon(\tau, s_1, s_2)$. Incomes of white fathers tend to be larger than those of black fathers (figure 5) below virtually all fixed percentile thresholds, so that black sons need a larger increase in absolute income to be counted by the transition probability measure.
Lognormal illustrations: To gain some concrete insight into these measures, denote incomes for father and son by $(Y_0, Y_1)$ and assume that incomes are jointly log-normal with $\mathrm{Corr}(\ln Y_0, \ln Y_1) = \rho \ge 0$. Then calculation of (2) and (5) in this lognormal model (note that these measures are invariant to mean and variance changes) yields, for example,

$$\upsilon(\tau, 0, s) = \frac{1}{s}\, E\left[ 1\left\{Z \le \Phi^{-1}(s)\right\} \times \Phi\left( \frac{\rho Z - \Phi^{-1}\left(\Phi(Z) + \tau\right)}{\sqrt{1 - \rho^2}} \right) \right],$$

$$\theta(s, (0, s)) = \frac{1}{s}\, E\left[ 1\left\{Z \le \Phi^{-1}(s)\right\} \times \Phi\left( \frac{\rho Z - \Phi^{-1}(s)}{\sqrt{1 - \rho^2}} \right) \right],$$

where $Z \sim N(0, 1)$ and $\Phi$, $\phi$ denote its c.d.f. and density, respectively. Note that $\rho$ represents the intergenerational correlation that is typically estimated to be in the 0.2 to 0.5 range for industrialized countries. Appendix Figure A1 plots the transition probability (black line), upward mobility (gray line) and the difference between them (dotted line) as functions of $\rho$, for $\tau = 0.05$ (panel A), $0.20$ (panel B) and $0.50$ (panel C). The graphs show that: (i) both mobility measures decline from 1 to 0 as $\rho$ rises from very negative (fortune reversal) to very positive (high persistence); (ii) when $\tau$ is closer to 0 as in panel A, the upward mobility measure is including smaller gains in ranks and is thus higher than the transition probability for fixed $\rho$; (iii) for any fixed $\tau$, the difference between upward mobility and transition probability is close to zero when $\rho$ is close to $-1$, rises with $\rho$, reaches a peak, then declines as $\rho$ increases.
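The two lognormal expressions above can be evaluated numerically with the standard library alone. The sketch below rewrites each expectation as an average over the father's rank $u = \Phi(Z)$, uniform on $(0, s)$, and approximates it with a midpoint rule; the grid size and function names are assumptions made for illustration, and this is not the procedure used to draw Figure A1.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # cdf is Phi, inv_cdf is Phi^{-1}

def _mean_over_father_ranks(rho, s, cutoff_rank, grid=20_000):
    """Midpoint-rule average, over father ranks u in (0, s), of
    Pr(son's rank > cutoff_rank(u) | father's rank = u) under joint normality of logs."""
    if not (-1.0 < rho < 1.0 and 0.0 < s < 1.0):
        raise ValueError("need -1 < rho < 1 and 0 < s < 1")
    scale = math.sqrt(1.0 - rho * rho)
    total = 0.0
    for k in range(grid):
        u = s * (k + 0.5) / grid           # father's rank at the cell midpoint
        z = STD_NORMAL.inv_cdf(u)          # father's standardized log income
        c = STD_NORMAL.inv_cdf(cutoff_rank(u))
        total += STD_NORMAL.cdf((rho * z - c) / scale)
    return total / grid

def upward_mobility_lognormal(rho, tau, s):
    """upsilon(tau, 0, s): son's rank exceeds father's rank by more than tau."""
    if not 0.0 <= tau <= 1.0 - s:
        raise ValueError("need 0 <= tau <= 1 - s")
    return _mean_over_father_ranks(rho, s, lambda u: u + tau)

def transition_probability_lognormal(rho, s):
    """theta(s, (0, s)): son's rank exceeds s given father's rank at or below s."""
    return _mean_over_father_ranks(rho, s, lambda u: s)

for rho in (0.2, 0.35, 0.5):
    ups = upward_mobility_lognormal(rho, tau=0.05, s=0.5)
    tp = transition_probability_lognormal(rho, s=0.5)
    print(rho, round(ups, 4), round(tp, 4), round(ups - tp, 4))
```

Two checks are available in closed form: with $\rho = 0$ and $s = 0.5$ the transition probability is $1 - s = 0.5$ and upward mobility at $\tau = 0$ is $1 - s/2 = 0.75$, and for $s = 0.5$ the transition probability equals $0.5 - \arcsin(\rho)/\pi$.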
When the population is a mixture of several subgroups, e.g., blacks and whites, the difference in the two measures becomes analytically intractable. However, one can surmise from the single group analysis that if blacks are poorer than whites within every quantile class but have similar values of $\rho$, then the difference between the upward mobility measure and the transition probability for blacks will be higher than the corresponding difference for whites. So the white-black difference in transition probabilities will be higher than that for upward mobility. This effect may be either enhanced or mitigated if the intergenerational correlation in income for whites exceeds that for blacks (note that in Figure A1, the difference in the two measures first rises and then falls as $\rho$ increases). The intergenerational correlations in our data for blacks and whites are quite similar, at around 0.25 to 0.3 for each.8 Therefore our empirical finding of higher inter-racial differences in transition probability is most likely due to the first reason, viz., the inter-racial differences in income distribution below the reference threshold. This suggests that upward mobility is a useful alternative, or at least complementary, measure to consider rather than relying on just transition probabilities for the quantitative assessment of income mobility.

8 These values are lower than what is typically found in the literature. This is mostly because the sample is not conditioned on having a minimum number of years of parent income (e.g. 5 year averages) as in Solon (1992) to address attenuation bias.

Alternative definition of mobility: One can alternatively define overall mobility based on transition matrices after incorporating effects of covariates. Consider a transition matrix based on an arbitrary $M$-class discretization of the marginal distributions of $Y_0$ and $Y_1$: $\tilde{\Theta} = \{\tilde{\theta}(i, j)\}_{i,j=1}^{M}$. Then Shorrocks’s (1978) measure of unconditional mobility is given by

$$M_1 = \frac{M - \mathrm{tr}(\tilde{\Theta})}{M - 1} = 1 - \frac{\sum_{j=1}^{M} \tilde{\theta}(j, j) - 1}{M - 1}.$$

One can incorporate covariates into the above formula and define

$$M_1(x) = 1 - \frac{\sum_{j=1}^{M} \tilde{\theta}(j, j; x) - 1}{M - 1}, \tag{6}$$

where

$$\tilde{\theta}(j, j; x) = \Pr\left(\xi_j \le Y_1 \le \xi_{j+1} \mid \eta_j \le Y_0 \le \eta_{j+1},\ X = x\right),$$

and $\xi_j$ and $\eta_j$ denote the $j$th marginal quantiles of $Y_1$, $Y_0$, respectively. Given the simple linear relation (6), inference on $M_1(x)$ will follow straightforwardly from inference on $\tilde{\theta}(j, j; x)$. However, this measure will depend crucially on the discretization employed
which is clearly an undesirable feature. Altering the above formulas to allow for a continuous transition matrix seems complicated9 and we leave that to future research.
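The trace-based index above is simple to compute from any square transition matrix. The following sketch uses two hypothetical matrices as limiting cases; the function name is illustrative.

```python
def shorrocks_index(matrix):
    """Trace-based mobility index (M - trace) / (M - 1) for an M x M transition matrix."""
    m = len(matrix)
    if m < 2 or any(len(row) != m for row in matrix):
        raise ValueError("expected a square matrix with at least two classes")
    trace = sum(matrix[j][j] for j in range(m))
    return (m - trace) / (m - 1)

# Hypothetical limiting cases (illustrative only).
immobile = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]            # every son stays in the father's class
origin_independent = [[0.25] * 4 for _ in range(4)]  # destination unrelated to origin

print(shorrocks_index(immobile), shorrocks_index(origin_independent))
```

The index is 0 under complete immobility and 1 when the destination class is independent of the origin class. Changing the number of classes generally changes the diagonal entries and hence the index, which is the dependence on discretization noted above.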

3.3 Measurement Error

Researchers working on earnings mobility have paid particular attention to measurement error in sons’ and fathers’ earnings in the context of intergenerational regressions (c.f., Haider and Solon, 2006).10 It is interesting to note that all our measures of mobility are based on the relative positions of individuals in the population. So if ranks of individuals are preserved despite measurement errors, then our measures will not be affected by the fact that we have erroneous earnings measures. One specific example of this is where reported earnings are a monotone function of true earnings, i.e., if for two people 1 and 2, true incomes satisfy $Y_1^* < Y_2^*$, then their reported incomes satisfy $Y_1 < Y_2$. This can be easily consistent with non-classical measurement error, i.e., $Y - Y^*$ being negatively correlated with true earnings $Y^*$ (Bound et al., 1994). In this case, all our measures based on reported $Y$ will be identical to those based on $Y^*$. With more general types of measurement error, using time averaged incomes or earnings, as is common in the IGM literature, can partially mitigate the effect of purely random measurement error in addition to providing more reliable estimates for permanent income. This is the approach we follow in the application below. Finally, our measures of son’s earnings are taken around the age of 40, when life-cycle bias (Haider and Solon, 2006) is minimized.

9 The problem is that $\int_0^1 \theta(s, s)\, ds$ is not a probability, unlike $\sum_{j=1}^{M} \tilde{\theta}(j, j)$.
10 We are aware of only one study that has examined the effect of measurement error in the context of transition probabilities. O’Neill et al. (2007) show that measurement error can induce a modest bias in transition probabilities compared to regressions and that this bias may vary at different points of the distribution.

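The rank-preservation argument in the measurement error discussion can be illustrated with a small sketch. The reporting rule below is purely hypothetical: reported income is a strictly increasing function of true income, so the reporting error falls as true income rises, yet every individual's rank is unchanged.

```python
import bisect

def empirical_ranks(values):
    """Empirical c.d.f. at each observation: share of the sample at or below it."""
    ordered = sorted(values)
    n = len(values)
    return [bisect.bisect_right(ordered, v) / n for v in values]

# Hypothetical true incomes and a hypothetical mean-reverting reporting rule.
true_income = [12.0, 30.0, 18.0, 75.0, 41.0]
reported_income = [20.0 + 0.5 * y for y in true_income]

# Reporting error (reported minus true) declines with true income: non-classical error.
reporting_error = [r - y for r, y in zip(reported_income, true_income)]

ranks_true = empirical_ranks(true_income)
ranks_reported = empirical_ranks(reported_income)
print(ranks_true == ranks_reported, reporting_error)
```

Because both the transition probability and the upward mobility measure depend on incomes only through such ranks, they would take the same value on the reported and the true incomes in this example.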
Based on the parameters (2) and (5), one can define the corresponding marginal effects
by differentiating them with respect to the regressor values and/or summarize these by
density-weighted average derivatives, à la Powell, Stock and Stoker (1989). For brevity,
we do not pursue these quantities here.

4 Estimation and distribution theory

We now turn to estimation of the parameters and derivation of their asymptotic properties.
Note that we have defined four parameters above, viz., (1), (2), (4), (5). Formby, Smith
and Zheng (2004) analyzed only (1) and so, in what follows, we derive the
distribution theory for the other three. The analysis of (2) requires a slight modification of
standard kernel regression theory, and we only provide an outline of the proof by pointing out
the modifications needed. The estimators of (4) and (5), however, are fundamentally harder
to analyze owing to the presence of $\hat F_1(\cdot)$ and $\hat F_0(\cdot)$ in the definition of the dependent
variables. Our analysis of these relies crucially on the idea of Hadamard differentiability,
and we use Hoeffding’s inequality to control the errors involved in the estimation of
$\hat F_1(\cdot)$ and $\hat F_0(\cdot)$.
Given that the support of income variables can be taken to be bounded below, without
loss of generality we assume that the supports of $Y_0$ and $Y_1$ are subsets of $[1, \infty)$. Note
that all our mobility measures are based on quantiles, and so fixed location shifts in either
variable do not affect any of the measures. For fixed $t$, $s$, denote the $t$th quantile of $Y_1$ by
$\zeta_1$ and the $s$th quantile of $Y_0$ by $\zeta_0$, with corresponding estimates $\hat\zeta_1$, $\hat\zeta_0$. For the transition
probability, $D$ will denote the indicator $1\{Y_1 \le \zeta_1, Y_0 \le \zeta_0\}$, and for upward mobility, $D$
will denote $1(F_1(Y_1) - F_0(Y_0) > \tau)$; $W$ will denote $1\{F_0(Y_0) \le s\} := 1\{Y_0 \le \zeta_0\}$ in
both cases.

4.1 Conditional transition probability

We first state the distribution theory for estimating the conditional transition probability, c.f. (2),
$$\theta(x; t, s) = \Pr[Y_1 \ge \zeta_1 \mid Y_0 \le \zeta_0, X = x] = 1 - \frac{\Pr[Y_1 \le \zeta_1, Y_0 \le \zeta_0 \mid X = x]}{\Pr[Y_0 \le \zeta_0 \mid X = x]}. \tag{7}$$

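Now, (7) can be estimated by
$$\hat\theta(x; t, s) = 1 - \frac{\frac{1}{n\sigma_n^d}\sum_{i=1}^{n} K\left(\frac{X_i - x}{\sigma_n}\right) 1\left(Y_{1i} \le \hat\zeta_1, Y_{0i} \le \hat\zeta_0\right)}{\frac{1}{n\sigma_n^d}\sum_{i=1}^{n} K\left(\frac{X_i - x}{\sigma_n}\right) 1\left(Y_{0i} \le \hat\zeta_0\right)}, \tag{8}$$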

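where $K(\cdot)$ is a $d$-dimensional kernel and $\sigma_n$ is a sequence of bandwidths. Let $f(y_0, y_1 \mid x)$
denote the density of $(Y_0, Y_1)$ conditional on $X = x$ and define
$$\phi(x, \zeta_0, \zeta_1) = \Pr\{Y_1 \le \zeta_1, Y_0 \le \zeta_0 \mid X = x\} = \int_1^{\zeta_1}\int_1^{\zeta_0} f(y_0, y_1 \mid x)\, dy_0\, dy_1,$$
$$\hat\phi(x, \hat\zeta_0, \hat\zeta_1) = \frac{\frac{1}{n\sigma_n^d}\sum_{i=1}^{n} K\left(\frac{X_i - x}{\sigma_n}\right) 1\left(Y_{1i} \le \hat\zeta_1, Y_{0i} \le \hat\zeta_0\right)}{\frac{1}{n\sigma_n^d}\sum_{i=1}^{n} K\left(\frac{X_i - x}{\sigma_n}\right)},$$
$$\hat\phi(x, \zeta_0, \zeta_1) = \frac{\frac{1}{n\sigma_n^d}\sum_{i=1}^{n} K\left(\frac{X_i - x}{\sigma_n}\right) 1\left(Y_{1i} \le \zeta_1, Y_{0i} \le \zeta_0\right)}{\frac{1}{n\sigma_n^d}\sum_{i=1}^{n} K\left(\frac{X_i - x}{\sigma_n}\right)}.$$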


The asymptotic distribution of the conditional (on covariates) transition probabilities is
based on the following proposition. We state this and subsequent propositions in terms
of a $d$-dimensional $X$, all of whose components are continuously distributed. For discrete
covariates, the analysis is identical to that for the marginal (i.e. unconditional) measures.
Proposition 1 Suppose that conditions NW1-5 in the appendix part A are satisfied.
Assume further that for $X = x$, $(Y_0, Y_1)$ admits a nonnegative joint density w.r.t. the
Lebesgue measure everywhere on the joint support. Then we have
$$(n\sigma_n^d)^{1/2}\left(\hat\phi(x, \hat\zeta_0, \hat\zeta_1) - \hat\phi(x, \zeta_0, \zeta_1)\right) = o_p(1).$$
Proof. See appendix part A.
The implication is that the asymptotic distribution of $\hat\phi(x, \hat\zeta_0, \hat\zeta_1)$ and that of the infeasible
estimator $\hat\phi(x, \zeta_0, \zeta_1)$ are identical. The argument is based on the observation that $(\hat\zeta_0, \hat\zeta_1)$
converges at the parametric $\sqrt{n}$ rate but $\hat\phi(x, \zeta_0, \zeta_1)$ converges to $\phi(x, \zeta_0, \zeta_1)$ slower than the
$\sqrt{n}$-rate, and a standard equicontinuity argument can then be used to handle the nonsmoothness
of $1(Y_1 \le \zeta_1, Y_0 \le \zeta_0)$ in the $\zeta$’s. Note also that through our assumptions, we
have used an "undersmoothed" estimator to achieve bias reduction and omitted bounded
moment assumptions on the errors because the dependent variable and $\phi(\cdot, \cdot, \cdot)$ lie in $[0, 1]$.
An exactly analogous proposition with a virtually identical proof applies to
$$\hat\psi(x, \hat\zeta_0) = \frac{\frac{1}{n\sigma_n^d}\sum_{i=1}^{n} K\left(\frac{X_i - x}{\sigma_n}\right) 1\left(Y_{0i} \le \hat\zeta_0\right)}{\frac{1}{n\sigma_n^d}\sum_{i=1}^{n} K\left(\frac{X_i - x}{\sigma_n}\right)}.$$
Returning to (8), we note that $\hat\theta(x; t, s) = 1 - \hat\phi(x, \hat\zeta_0, \hat\zeta_1) / \hat\psi(x, \hat\zeta_0)$, and its asymptotic
distribution follows by the standard delta method and the Cramer-Wold device.
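A kernel-weighted estimator of the form in (8) can be sketched in a few lines. This is only an illustration under stated assumptions: a scalar covariate (d = 1), an Epanechnikov kernel, sample quantiles from `np.quantile` as the plug-in quantile estimates, and hypothetical names (`conditional_transition`, `t`, `s`, `bandwidth`). It is not the authors' code. The normalising constants in numerator and denominator cancel, so they are omitted.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, zero outside [-1, 1]."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def conditional_transition(y0, y1, x, x_eval, t, s, bandwidth):
    """One minus the kernel-weighted share of sons at or below the t-th
    quantile of son income, among families with parent income at or below
    the s-th quantile of parent income, evaluated at covariate value x_eval."""
    y0, y1, x = (np.asarray(a, dtype=float) for a in (y0, y1, x))
    q1_hat = np.quantile(y1, t)   # estimated son quantile
    q0_hat = np.quantile(y0, s)   # estimated parent quantile
    w = epanechnikov((x - x_eval) / bandwidth)
    parent_low = y0 <= q0_hat
    both_low = parent_low & (y1 <= q1_hat)
    denom = np.sum(w * parent_low)
    if denom == 0.0:
        return np.nan   # no low-income parents inside the kernel window
    return 1.0 - np.sum(w * both_low) / denom
```

The quantiles are computed once from the full sample, while the kernel weights localise only the averaging step, which mirrors the two-stage structure that proposition 1 addresses.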

4.2 Marginal upward mobility

We now consider the notationally simpler version of $\nu(\tau, s)$ defined in (4)11
$$\nu(\tau, s) = \Pr\left(F_1(Y_1) - F_0(Y_0) > \tau \mid F_0(Y_0) \le s\right), \tag{9}$$
which can be estimated by
$$\hat\nu(\tau, s) = 1 - \frac{1}{ns}\sum_{i=1}^{n} 1\left(\hat F_1(Y_{1i}) \le \hat F_0(Y_{0i}) + \tau\right) 1\left(\hat F_0(Y_{0i}) \le s\right), \tag{10}$$
where
$$\hat F_1(Y_{1i}) = \frac{1}{n}\sum_{j \ne i} 1\left(Y_{1j} \le Y_{1i}\right).$$

We now state the asymptotic distribution of $\hat\nu(\tau, s)$. Let $F(\cdot, \cdot)$ denote the joint
c.d.f. of $(Y_0, Y_1)$ with corresponding joint density $f(\cdot, \cdot)$. Then for fixed $\tau$, $s$, one may
view $\nu(\tau, s)$ as a functional $\nu(F)$. We can therefore estimate it by $\nu(\hat F)$, where $\hat F$
denotes the usual empirical c.d.f. We will obtain a large sample distribution of $\nu(\hat F)$.
The key step is showing that the functional $F \mapsto \nu(F)$ is smooth in the Hadamard sense,
with a derivative at $F$ given by a linear functional $\nu'_F(\cdot)$.12 This is done in lemma 1 in
appendix part B and the relevant tail conditions and the proof are stated there as well.
If one assumes that the joint density of $(Y_0, Y_1)$ is bounded away from zero on a compact
support, then the proof is considerably simpler. This assumption would, however, exclude
families with "abnormally" high and low earnings in either generation (which is typically
where the density will be close to zero), and this is clearly undesirable. So we establish
Hadamard differentiability under more general tail conditions on the joint density and its
partial derivatives.
11 One can move from (9) to (4) using simple subtractions.
Given this lemma, the asymptotic distribution of $\hat\nu = \nu(\hat F)$ can be derived as follows.
Let $\rightsquigarrow$ denote standard weak convergence of distribution functions and define the Gaussian
process $\mathbb{G}$ by $\sqrt{n}(\hat F - F) \rightsquigarrow \mathbb{G}$. Then from lemma 1 and the functional delta method, we
have that
$$\sqrt{n}\,(\hat\nu - \nu_0) \rightsquigarrow \nu'_F(\mathbb{G}),$$
whence $\nu'_F(\mathbb{G})$ will be distributed as a univariate zero mean normal. See technical appendix part B for the exact form of this distribution.
It is well-known that the bootstrap provides consistent approximations to the asymptotic
distribution of the sample c.d.f. process $\sqrt{n}(\hat F - F)$.13 Using the Hadamard
differentiability result of our lemma 1, it follows, via the functional delta method for the
bootstrap in probability (c.f., van der Vaart and Wellner (1996), theorem 3.9.11), that
bootstrapping will lead to consistent approximation of the distribution of the estimator
$\hat\nu$. We summarize the above discussion in the following proposition, where "regularity
conditions A and B" refer to those in lemma 1 in the appendix part B and $\hat\nu^*$ denotes the
bootstrap version of $\hat\nu$.
Proposition 2 Under regularity conditions A and B, the bootstrap distribution of $\sqrt{n}(\hat\nu^* - \hat\nu)$
will consistently estimate the distribution of $\sqrt{n}(\hat\nu - \nu_0)$.
Proof. Follows from Hadamard differentiability of the map $F \mapsto \nu(F)$ (lemma 1 in
appendix B) by applying the functional delta method. See Appendix part B for details.
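As a concrete reference point for the estimator analysed in this subsection, the marginal upward mobility measure can be computed directly from within-sample ranks. The sketch below is an assumption-laden illustration rather than the authors' implementation: it uses the plain sample-analog ratio (dividing by the number of families with low parent rank), leave-one-out ranks of the kind defined for the estimated c.d.f., and hypothetical function names.

```python
import numpy as np

def loo_rank(v):
    """Leave-one-out empirical c.d.f.: (1/n) * #{j != i : v_j <= v_i}."""
    v = np.asarray(v, dtype=float)
    counts = (v[None, :] <= v[:, None]).sum(axis=1) - 1  # drop the j == i term
    return counts / v.size

def upward_mobility(y0, y1, tau, s):
    """Share of sons whose rank exceeds the parent rank by more than tau,
    among families whose parent rank is at or below s."""
    r0, r1 = loo_rank(y0), loo_rank(y1)
    low = r0 <= s
    if not low.any():
        return np.nan   # no family satisfies the conditioning event
    return np.mean(r1[low] - r0[low] > tau)

# Tiny hypothetical sample: parent incomes and son earnings for four families.
parents = [10, 20, 30, 40]
sons = [25, 15, 35, 5]
baseline = upward_mobility(parents, sons, tau=0.0, s=0.5)
stricter = upward_mobility(parents, sons, tau=0.3, s=0.5)
```

Both ranks are estimated from the same sample that is then averaged over, which is why the dependent variable is a nonsmooth function of estimated quantities.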
12 The concept of Hadamard differentiability has been used before in the context of analyzing features
of univariate income distributions, c.f. Bhattacharya (2007) and Barrett and Donald (2000). The results
obtained here involve more complicated functionals of bivariate distribution functions and are not related
to the results in the above papers.
13 For a textbook treatment, see theorem 3.6.1 part (iii) in van der Vaart and Wellner (1996) and its
discussion on page 346 of the same text.

In the application discussed below, we use the bootstrap to approximate standard
errors for the marginal upward mobility by race (table 2) and for mobility by race and
parent income (table 4). We also provide a histogram for the bootstrap distribution
(figure 3) and summarize some descriptive measures pertaining to the distribution, such
as moments, skewness and kurtosis (table 3). Standard tests fail to reject normality of the
distribution, as is to be expected, given the Gaussian form of the ingredients of $\nu'_F(\mathbb{G})$.

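4.3 Conditional upward mobility

Recall from (5) that conditional upward mobility is given by
$$\nu(\tau, s; x) = \Pr\left(F_1(Y_1) - F_0(Y_0) > \tau \mid F_0(Y_0) \le s, X = x\right).$$
Its natural estimate is then
$$\hat\nu(\tau, s; x) = \frac{\frac{1}{n\sigma_n^d}\sum_{i=1}^{n} K\left(\frac{X_i - x}{\sigma_n}\right) 1\left(\hat F_1(Y_{1i}) - \hat F_0(Y_{0i}) > \tau\right) 1\left(\hat F_0(Y_{0i}) \le s\right)}{\frac{1}{n\sigma_n^d}\sum_{i=1}^{n} K\left(\frac{X_i - x}{\sigma_n}\right) 1\left(\hat F_0(Y_{0i}) \le s\right)}.$$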
Recall the definitions $D := 1(F_1(Y_1) - F_0(Y_0) > \tau)$ and $W := 1\{F_0(Y_0) \le s\}$.
Proposition 3 Suppose the data $(X_i, Y_{1i}, Y_{0i})$ for $i = 1, \dots, n$ are i.i.d. and assumptions
NW1-4, NW5’ in the technical appendix part C hold. Then, we have that
$$\begin{aligned}
(n\sigma_n^d)^{1/2}\left[\hat\nu(\tau, s; x) - \nu(\tau, s; x)\right]
&= \frac{1}{E(W \mid X = x)} \times (n\sigma_n^d)^{1/2}\left\{\hat E(DW \mid X = x) - E(DW \mid X = x)\right\} \\
&\quad - \frac{E(DW \mid X = x)}{\{E(W \mid X = x)\}^2} \times (n\sigma_n^d)^{1/2}\left\{\hat E(W \mid X = x) - E(W \mid X = x)\right\} \\
&\quad + R_n(x),
\end{aligned}$$
where $\hat E(DW \mid X = x)$ and $\hat E(W \mid X = x)$ denote respectively the Nadaraya-Watson regression
estimates of $DW$ and $W$ on $X$ and the remainder $R_n(x)$ is $o_p(1)$.
Asymptotic normality now follows by standard arguments for Nadaraya-Watson regressions, e.g., Bierens (1994), theorem 10.2.1, whose conditions are implied by assumptions NW1-4, NW5’.
Proof. Appendix part C
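The two leading terms in the expansion of proposition 3 have the shape that a first-order expansion of a ratio of two estimated conditional means delivers. As a sketch, with $a$ and $b$ used as shorthand (introduced here for illustration only, not notation from the paper) for the conditional means in the numerator and the denominator, and $\hat a$, $\hat b$ for their kernel estimates:

```latex
% Shorthand assumed for illustration: a = conditional mean in the numerator,
% b > 0 = conditional mean in the denominator, \hat a and \hat b their estimates.
\begin{align*}
\frac{\hat a}{\hat b} - \frac{a}{b}
  &= \frac{\hat a - a}{\hat b} - \frac{a\,(\hat b - b)}{b\,\hat b} \\
  &= \frac{1}{b}\,(\hat a - a) - \frac{a}{b^{2}}\,(\hat b - b)
     + \left(\frac{1}{\hat b} - \frac{1}{b}\right)
       \left[(\hat a - a) - \frac{a}{b}\,(\hat b - b)\right].
\end{align*}
```

The last term is a product of two estimation errors, so it is of smaller order than the first two terms once both estimates are consistent. The substantive work in the proposition lies in showing that replacing the true ranks by estimated ranks inside the indicators also contributes only to the remainder.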
In our empirical application, we will want to compare the entire curve of the AFQT-conditioned
mobility for blacks with that of whites. Consequently, we need to construct
uniform confidence bands on these (difference in) regression curves, using the decomposition
in proposition 3. This corresponds to testing the null hypothesis that the black-white
difference in mobility conditional on $X = x$ is zero for every $x$. Based
on proposition 3 and strengthening the pointwise remainder bound $R_n(x) = o_p(1)$ to
$\sup_x |R_n(x)| = o_p(1)$ (see appendix part D), we can apply theorem 4.3.1 of Hardle (1990) to the
empirical process $(n\sigma_n^d)^{1/2}\left[\hat\nu(\tau, s; \cdot) - \nu(\tau, s; \cdot)\right]$ and construct uniform 95% confidence bands which will,
asymptotically, contain the true curve $\nu(\tau, s; x)$ with probability 95%. The details of
this construction and applicability of Hardle’s theorem are outlined in the appendix, part
D.

5 Data

We use the National Longitudinal Survey of Youth (NLSY), which starts with a sample of
individuals who were between the ages of 14 and 21 as of December 31st, 1978. Respondents
were interviewed annually through 1994 and every other year since then. To avoid
dealing with issues related to labor force participation, we focus only on men in this study.
The NLSY has largely not been used for intergenerational analysis because parents of
respondents were not interviewed. However, the parents of sons who were still living at home
during the first few years of the survey did report total family income.
Therefore, for those sons we use their average family income for any available years
between 1978 and 1980. We also measure the sons’ average annual earnings as adults for
any of the available years covering 1996, 1998, 2000 and 2002.14 This allows us to include
individuals even if income data are missing in some years for either generation. The time
averaging also provides a better measure of permanent income in both generations. Our
sample restrictions lead to a sample of 2766 white and black men.
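Averaging over whichever years are available can be sketched as follows. The records, the helper name `average_available` and the convention that `None` or NaN marks a missing year are all hypothetical; the actual NLSY variable construction is not shown in the text.

```python
import math

def average_available(values):
    """Mean of the non-missing entries; None or NaN marks a missing year."""
    observed = [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    if not observed:
        return None  # no usable year: the person would drop out of the sample
    return sum(observed) / len(observed)

# Hypothetical records, assumed to be already deflated to a common base year.
parent_income = {1978: 14000.0, 1979: None, 1980: 15000.0}
son_earnings = {1996: 21000.0, 1998: float("nan"), 2000: 23000.0, 2002: 25000.0}

parent_avg = average_available(parent_income.values())
son_avg = average_available(son_earnings.values())
```

A person with a single observed year is still retained under this rule, which is what allows individuals with partially missing income histories to stay in the sample.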
A key covariate for our analysis is the Armed Forces Qualification Test (AFQT). All
individuals in the NLSY were given the AFQT in 1980 as part of the renorming of
the test.15 For education we use years of completed schooling by age 26.
14 All income variables are deflated to 1978 dollars using the CPI-U.
15 Following Neal and Johnson (1996) we use the 1989 version of the percentile score.

6 Results

Our estimates use the two measures described earlier: transition probabilities and upward
mobility. We show both unconditional estimates and estimates conditional on AFQT.

6.1 Marginal probabilities

6.1.1 Upward transition probabilities

We begin by showing estimates of transition probabilities. We have simplified the notation
from (1) to use a common cutoff, $s$, in both generations. To facilitate comparisons
with the upward mobility measure we have introduced in this paper, we also consider
transition probabilities where the son must surpass the quantile by the amount $\tau$, viz.,
$\Pr[F_1(Y_1) > s + \tau \mid F_0(Y_0) \le s]$. Confidence intervals for these are calculated using analogs
of the analytical formulae in Formby, Smith and Zheng (2004).
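The transition probability with a common cutoff and a required margin can be computed from within-sample ranks. The following is a minimal sketch on a made-up four-family sample, with hypothetical function names; it also computes the share of the same families whose sons simply gained rank, to show how the two notions can differ.

```python
import numpy as np

def ecdf_rank(v):
    """Empirical c.d.f. value of each observation within its own sample."""
    v = np.asarray(v, dtype=float)
    return (v[None, :] <= v[:, None]).mean(axis=1)

def transition_probability(y0, y1, s, tau=0.0):
    """Share of sons with rank above s + tau among families with parent rank <= s."""
    r0, r1 = ecdf_rank(y0), ecdf_rank(y1)
    low = r0 <= s
    if not low.any():
        return np.nan   # no family satisfies the conditioning event
    return np.mean(r1[low] > s + tau)

# Hypothetical sample of four families.
parents = [10, 20, 30, 40]
sons = [15, 35, 5, 25]

crossed = transition_probability(parents, sons, s=0.5)            # crossed the median
crossed_margin = transition_probability(parents, sons, s=0.5, tau=0.3)

# For comparison: share of the same families whose sons gained any rank at all.
r0, r1 = ecdf_rank(parents), ecdf_rank(sons)
gained = np.mean((r1 - r0 > 0)[r0 <= 0.5])
```

In this toy sample the first son moves up in rank without crossing the median, so he counts toward the rank-gain share but not toward the transition probability.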

The results are shown in Table 1. In the first set of three columns we produce separate
estimates for whites, blacks, and the white-black difference for the baseline case where
$\tau = 0$. In the subsequent sets of columns we allow $\tau$ to vary from 0.1 to 0.3. In each row
we condition on parent income being below the $s$ percentile where $s$ goes from 0.05 to
0.5 in increments of 0.05. The racial differences are striking. For example, the baseline
transition probability out of the bottom quartile is 71 percent for whites but only 45
percent for blacks, a 26 percentage point difference. We plot the transition probabilities
for whites and blacks along with the pointwise 95 percent confidence intervals in Figure 1.
As is evident in the chart, except for those at the very bottom of the distribution (below
the 5th percentile), blacks are significantly less likely to surpass the quantile thresholds.
Interestingly, the white-black difference in the transition probability out of the bottom
quartile does not change very much as we allow $\tau$ to vary.16 When we condition on parents
that are at or below the median and allow $\tau$ to be large (0.2 to 0.3), we find that the
interracial mobility gap begins to narrow to a smaller, but still significant, 10 percentage
point difference.
16 For example the racial gap in the probability of rising from the bottom quartile to at least the 45th
percentile (i.e. $\tau = 0.2$) is 23 percentage points.

6.1.2 Upward mobility

We now show an analogous set of estimates of our upward mobility measure
$$\Pr\left(F_1(Y_1) - F_0(Y_0) > \tau \mid F_0(Y_0) \le s\right)$$
for whites and blacks and the white-black difference in Table 2. We now find much
smaller racial differences in our baseline case ($\tau = 0$). For example, among white men
whose family income during their youth was below the 25th percentile, 84 percent achieved
a higher percentile than their parents. The comparable figure for black men is 76 percent,
implying a difference of about 8 percentage points. Figure 2 plots the results along with
pointwise 95% confidence bands. To calculate pointwise confidence intervals
for mobility $\nu$, we compute the sample analog $\hat\nu$ and then draw 200 bootstrap resamples
from our sample. The use of the bootstrap is justified via the functional delta method,
discussed above.17 For each bootstrap resample, we calculate the corresponding estimate,
$\nu^*$, and the statistic $T^* = \sqrt{n}\,|\nu^* - \hat\nu|$. We then calculate $c^*$, the 95th percentile of $T^*$,
and use $\left(\hat\nu - c^*/\sqrt{n},\ \hat\nu + c^*/\sqrt{n}\right)$ as our confidence interval. We calculate the standard errors
shown in the table by taking the standard deviation of $\nu^*$. The histogram for the
bootstrap distribution of $(\nu^* - \hat\nu)$ is plotted in figure 3 for the case of whites where $s = 0.25$
and $\tau = 0.2$. We report the summary statistics (e.g. skewness, kurtosis and tests for
normality) for various values of $s$ and $\tau$ in Table 3. Because the histograms do not look
perfectly symmetric, we also calculated equal-tailed confidence intervals. Since we found
no consistent pattern in the relative size of the confidence intervals between the symmetric
and the equal-tailed ones, we chose to report the symmetric ones.
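The symmetric bootstrap interval described in this subsection can be sketched as follows: resample families with replacement, recompute the estimate, take the 95th percentile of the scaled absolute deviations, and use the standard deviation of the bootstrap estimates as the standard error. The data are simulated and the function names, seed and default arguments are assumptions for illustration; the rank estimator is the plain sample analog, recomputed within each resample.

```python
import numpy as np

def upward_mobility(y0, y1, tau, s):
    """Sample-analog upward mobility based on within-sample ranks."""
    y0, y1 = np.asarray(y0, dtype=float), np.asarray(y1, dtype=float)
    r0 = (y0[None, :] <= y0[:, None]).mean(axis=1)
    r1 = (y1[None, :] <= y1[:, None]).mean(axis=1)
    low = r0 <= s
    return np.mean(r1[low] - r0[low] > tau) if low.any() else np.nan

def bootstrap_interval(y0, y1, tau, s, n_boot=200, level=0.95, seed=0):
    """Symmetric (unstudentized) bootstrap interval and bootstrap standard error."""
    y0, y1 = np.asarray(y0, dtype=float), np.asarray(y1, dtype=float)
    rng = np.random.default_rng(seed)
    n = y0.size
    nu_hat = upward_mobility(y0, y1, tau, s)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample whole families with replacement
        draws[b] = upward_mobility(y0[idx], y1[idx], tau, s)
    stat = np.sqrt(n) * np.abs(draws - nu_hat)   # scaled absolute deviations
    half_width = np.quantile(stat, level) / np.sqrt(n)
    return nu_hat, (nu_hat - half_width, nu_hat + half_width), draws.std(ddof=1)

# Simulated parent and son incomes, purely for illustration.
rng = np.random.default_rng(1)
parents = rng.lognormal(size=300)
sons = parents ** 0.5 * rng.lognormal(size=300)
estimate, (lower, upper), std_err = bootstrap_interval(parents, sons, tau=0.0, s=0.25)
```

An equal-tailed alternative would instead take the 2.5th and 97.5th percentiles of the signed deviations, which allows the interval to be asymmetric around the estimate.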
As figure 2 shows, aside from those whose family income was at or below the fifth
percentile, whites experience greater upward mobility than blacks but not nearly as much
as implied by transition probabilities. The gap in most cases, however, is statistically
significant, as is shown in figure 4 where we plot the white minus black difference for both
the transition probability and the upward mobility along with confidence bands.
Among poorer families there are many blacks who exceed their parents’ rank in the
distribution but do not surpass them by enough to move across specific quantiles. As
discussed in section 3, the fact that the white distribution of parent income lies to the
right of that of blacks over most of the support makes it more likely that whites will surpass the
quantile thresholds. This is illustrated in Figure 5, which plots the c.d.f.’s of
the parent income distribution for both blacks and whites. This implies that if blacks and
whites below the threshold experienced equal sized percentile gains, then the transition
probabilities would generally be higher for whites.18
17 While studentization may be preferable before bootstrapping for higher order refinements, it is
quite challenging to simulate the distribution stated in theorem 1 and so we simply use the unstudentized
version.
The remaining columns of Table 2 show the comparable results as $\tau$ varies from 0.1
to 0.3. In each case, the magnitude of the black-white difference is generally between 15
and 25 percentage points and does not change too much as $\tau$ changes. These results are
comparable to the upward transition probability results in Table 1 and suggest that the
two measures produce roughly similar results for larger values of $\tau$.
Thus far we have used progressively larger samples as we increase $s$. This “cumulative”
approach could obscure patterns that might arise if we focused more finely on specific parts
of the income distribution. In addition, the fact that the white distribution lies to the
right of the black distribution suggests that blacks may have a built-in advantage with
respect to upward mobility using cumulative samples since they have more “room” to
rise. To address this we recalculated measures by using non-overlapping ranges ($s_1$ to
$s_2$) for parent income that move progressively up the income distribution. Table 4, which
presents these results, demonstrates that much of the rapid upward mobility experienced
by blacks is concentrated at the very bottom of the distribution. For example, among
families between the 21st and 25th percentile, upward mobility is 28 percentage points
higher for whites than blacks. Overall, these results suggest that by most measures,
the extent of upward mobility among blacks is vastly lower than for whites.

6.2 Conditional probabilities

Estimates conditional on key covariates can potentially shed light on the underlying mechanisms
behind the intergenerational transmission of economic status and the source of the
black-white mobility gap. As noted in section 1, previous studies have taken advantage
of the AFQT as a measure of cognitive skills to study black-white differences in levels.19
Unlike these previous studies, our measures of mobility capture movements in the distribution
relative to the parent generation, so it is not obvious whether mobility gaps will be
eliminated the same way that level gaps are once we include AFQT scores as a covariate.
Another difference from previous work is that we account for test scores nonparametrically.
We employ Nadaraya-Watson regressions. To do so, we first normalized the
regressor to lie between 0 and 1, using the maximum and minimum possible values of the
AFQT, viz., 99 and 1, and then estimated the regressions at 100 points with spacing of
0.01. We used an Epanechnikov kernel and chose the bandwidth $\sigma_n$ in accordance with
assumption NW4 above where $d = 1$.20 For inference, we calculated uniform bands using
the analytical formulas from Hardle (1990, algorithm 4.3.2), which are based on Bickel
and Rosenblatt (1973).21 The latter steps are reproduced in the appendix part D. Uniform,
rather than pointwise, confidence bands are necessary because here we are making
inference on the entire conditional mobility curve as a function of the conditioning variable
AFQT to see how mobility differences vary with AFQT. Therefore we need data-based
bands which contain the entire true curves with at least a pre-assigned probability. Joining
pointwise confidence limits will reduce the coverage probability arbitrarily below the
nominal level, leading to wrong confidence statements.
18 However, in other results (not shown) we also find that the magnitude of the percentile gains for
blacks is actually much lower than for whites.
19 Neal and Johnson (1996) have shown that the black-white wage gap among adults can largely be
explained by pre-market skills as proxied by AFQT scores during adolescence. O’Neill, Sweetman and
Van de Gaer (2006) show that equalization of cognitive skill gaps does not fully account for the black-white
gap at the low end of the distribution. Cameron and Heckman (2001) have shown that the sizable
gap in college enrollment between whites and blacks can largely be accounted for by AFQT scores.
20 We experimented with bandwidths around the range $n^{-1/4}$ (moving from $n^{-1/5}$ to $n^{-1/3}$), where $n$
denotes the size of the effective sample (this size varies depending on which parent percentile and race are
conditioned on). Our results for conditional mobility were quite stable over this range and so we report
the results at the $\sigma_n = n^{-1/4}$ value.
21 Hardle (1990), theorem 4.3.1 justifies the validity of this algorithm. In the appendix, we show that
our conditions NW1-NW4, NW 5’ and NW 6 imply the sufficient conditions A1-A5 (page 116, Hardle,
1990) for this theorem to hold.

6.2.1 Conditional Transition Probabilities

We estimate the effect of AFQT scores on the probability of leaving the bottom quintile
separately by race. Figure 6 shows that conditional on AFQT scores, whites have only a
slightly higher likelihood of exiting the bottom quintile and that this gap does not vary a
great deal across the AFQT distribution. For example at the 25th interval of our normalized
AFQT scores, the transition probability for whites is 0.63 and for blacks is 0.61, a
difference of just 2 percentage points. At the 10th interval the gap is about 7 percentage
points and at the 75th interval the gap is about 15 percentage points. At no point in the
AFQT distribution can we reject the hypothesis that the transition probabilities are the
same. The shapes of the regression lines are also similar between blacks and whites for
the bottom half of the distribution. In the upper half of the AFQT distribution, however,
the slopes differ and the lines fan apart. It is important to note, however, that there is
relatively little data for blacks in the upper end of the AFQT distribution, as is evidenced
by the widening confidence intervals.
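The regression setup described at the start of section 6.2 (AFQT rescaled from [1, 99] to [0, 1], a 100-point grid with spacing 0.01, an Epanechnikov kernel and a bandwidth of order n to the power -1/4) can be sketched as below. The seven data points are made up, the grid is assumed to start at 0.01, and `nadaraya_watson` is a hypothetical helper rather than the authors' routine.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, zero outside [-1, 1]."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def nadaraya_watson(x, y, grid, bandwidth):
    """Kernel-weighted mean of y at each grid point (NaN where no data fall in the window)."""
    x, y, grid = (np.asarray(a, dtype=float) for a in (x, y, grid))
    w = epanechnikov((x[None, :] - grid[:, None]) / bandwidth)
    num = (w * y[None, :]).sum(axis=1)
    den = w.sum(axis=1)
    safe_den = np.where(den > 0, den, 1.0)   # avoid dividing by zero
    return np.where(den > 0, num / safe_den, np.nan)

# Hypothetical percentile scores and 0/1 mobility outcomes.
afqt = np.array([5, 20, 35, 50, 65, 80, 95], dtype=float)
moved_up = np.array([0, 0, 1, 1, 1, 1, 1], dtype=float)

x = (afqt - 1.0) / (99.0 - 1.0)     # rescale from [1, 99] to [0, 1]
grid = np.arange(1, 101) / 100.0    # 100 points spaced 0.01 apart (start point assumed)
h = len(x) ** (-1 / 4)              # bandwidth n^(-1/4)
curve = nadaraya_watson(x, moved_up, grid, h)
```

Because the outcome is a 0/1 indicator, every fitted value is a weighted share and so lies in [0, 1] without any further constraint.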
In figure 7 we contrast the AFQT results with analogous results using the sons’ years
of education. We find sharp differences even conditional on education, throughout much
of the distribution. For example among those with 10 years of schooling, the transition
probability out of the bottom quintile for whites is 67 percent while for blacks it is just 45
percent. At the higher end of the education distribution, however, the racial gap converges
and at the very top of the distribution, black mobility is actually higher. Our confidence
intervals are large so although the differences are quite big over much of the distribution,
they are not statistically significant.22
Finally, the results from using our nonparametric approach are substantively different
from simply estimating a probit with AFQT as a covariate, i.e.,
$$\Pr[Y_1 \le \zeta_1, Y_0 \le \zeta_0 \mid X = x] = \Phi(\beta_0 + \beta_1 \times x).$$
This is particularly true for whites at the bottom of the distribution and for blacks at the
top of the distribution. In figure 8 we compare the transition probability results for whites
in the bottom of the distribution with the results from simply using a probit. As the chart
shows, moving from the first percentile of the AFQT distribution to the median nearly
doubles the transition probability of leaving the bottom quintile for whites from 0.43 to
0.85 when using the probit. In contrast, the non-parametric estimator implies an increase
of only about 27 percentage points from 0.52 to 0.79. This is not surprising because the
probit estimate at a point is affected by the outcome at far-off regressor values, unlike the
nonparametric estimates.
6.2.2 Conditional Upward Mobility

We also estimate the effect of AFQT scores on our measure of upward mobility separately
by race. For this exercise, we condition on parent income being at or below the 20th
percentile and set $\tau = 0$. The results are shown in Figure 9. In this case the effects on the
black-white gap are even more striking as the point estimates imply that upward mobility
is virtually identical for blacks and whites in the bottom half of the distribution.
22 In similar exercises using measures of parent education (not shown) we find broadly similar results.
Hertz (2005) also found that parent education cannot explain the black-white mobility gap.

7 Concluding Thoughts

In this paper we develop new analytic tools that allow for an investigation of inter-racial
differences in IGM and its underlying sources. Using large intergenerational samples from
the NLSY, we document that upward transition probabilities for blacks in the bottom of
the income distribution are sharply lower than for whites. We introduce a new measure
of upward mobility that overcomes some of the limitations of the transition probability.
The new measure is simply the probability that the son’s rank in the distribution exceeds
the parents’ rank in the prior generation. The baseline upward mobility measure shows a
much smaller inter-racial gap in IGM, partially because it reflects the fact that many blacks
make small gains in rank over generations that are missed by the transition probability.
On the other hand, if we adjust our upward mobility measure to require rank gains of a
certain amount, then the two measures paint a more similar picture of low upward mobility
for blacks.
We also investigate, using non-parametric methods, how the inter-racial differences in
upward mobility are impacted by incorporating the effects of cognitive skills during adolescence
as measured by AFQT scores. Remarkably, we find that AFQT scores can account
for virtually the entire black-white difference in upward mobility using either measure.
This suggests that early life interventions that address pre-market skills may be more
effective than policies that target labor market institutions. It is not yet clear, however,
exactly how policy can best address the racial gap in children’s test scores. Cognitive
skills are influenced by a range of factors including health endowments, parental investment,
peer influences and school quality. Recent work by Chay, Guryan and Mazumder
(2009) using military applicant data shows that much of the apparent narrowing of the
black-white test score gap during the 1980s can be attributed to improvements in infant
health arising from greater access by Southern blacks to hospitals during the 1960s.
Dobbie and Fryer (forthcoming) find that the Harlem Children’s Zone, which combines
community programs with charter schools, can significantly close black-white achievement
gaps. These results among others suggest that there is potential for policy to address the
sharp black-white differences in upward mobility highlighted here.
There are many other aspects of inter-racial differences in IGM which we have not
considered. For example, there may be important differences by gender. Analyses of
other covariates such as measures of health, family structure, wealth and non-cognitive
skills are also important areas for examination. For example, Heckman, Stixrud and
Urzua (2006) demonstrate the importance of non-cognitive skills (e.g. dependability,
persistence) for socioeconomic outcomes. We have also limited our outcome of interest to
labor market earnings and it may be fruitful to analyze patterns in mobility with respect
to other measures such as hours worked, wages and total family income. Finally, we
have limited our analysis to upward mobility but there may be important inter-racial
differences with respect to downward mobility as well.
The methodological innovations of the present paper were primarily motivated by
nonparametric empirical analysis of IGM, but they have potentially more general applicability.
One can directly apply our methods to the analysis of intragenerational upward
mobility, considered in Kopczuk, Saez and Song (2010), but by adding covariates or comparing
differences by industry or occupation. A second potential application pertains to
estimating the probability that a high school student finishes in the top 10% of her high
school class as a function of covariates. Such students are offered guaranteed admission to
the flagship university by many US states, like Texas, Florida and California (c.f., Cullen
et al 2011). Note that such relative regressions, which implicitly control for inter-school
variations in absolute grades, are different from quantile regressions in that they capture
effects of covariates on the relative position in the marginal distribution of the dependent
variable and not the conditional (on the covariate) distribution. More generally, whenever
the parameter of interest is a nonparametric regression or a functional thereof but the
dependent variable involves preliminary components estimated from the same dataset,
possibly in a non-smooth way, the methods developed here will be relevant.

Table 1: Transition Probability Estimates by Race
θ = Prob(F1(Y1) > s + τ, F0(Y0) < s) / Prob(F0(Y0) < s)

      τ = 0                   τ = 0.1                 τ = 0.2                 τ = 0.3
s     Whites  Blacks  W-B     Whites  Blacks  W-B     Whites  Blacks  W-B     Whites  Blacks  W-B
0.05  0.978   0.891   0.087   0.849   0.579   0.270   0.704   0.407   0.297   0.593   0.280   0.312
      (0.030) (0.025) (0.041) (0.057) (0.043) (0.073) (0.070) (0.044) (0.084) (0.084) (0.040) (0.093)
0.10  0.917   0.702   0.216   0.760   0.458   0.302   0.632   0.340   0.292   0.555   0.249   0.306
      (0.030) (0.027) (0.042) (0.046) (0.030) (0.055) (0.053) (0.028) (0.061) (0.054) (0.025) (0.059)
0.15  0.812   0.616   0.196   0.692   0.423   0.269   0.542   0.309   0.232   0.459   0.212   0.247
      (0.030) (0.026) (0.042) (0.035) (0.026) (0.045) (0.037) (0.023) (0.046) (0.038) (0.020) (0.046)
0.20  0.752   0.524   0.228   0.618   0.389   0.229   0.496   0.281   0.215   0.379   0.192   0.187
      (0.028) (0.025) (0.041) (0.033) (0.025) (0.044) (0.035) (0.022) (0.041) (0.037) (0.019) (0.043)
0.25  0.708   0.447   0.261   0.558   0.326   0.232   0.459   0.234   0.225   0.342   0.156   0.186
      (0.025) (0.024) (0.036) (0.026) (0.021) (0.035) (0.027) (0.019) (0.034) (0.029) (0.017) (0.035)
0.30  0.646   0.403   0.244   0.539   0.290   0.249   0.418   0.200   0.217   0.305   0.131   0.174
      (0.024) (0.020) (0.034) (0.026) (0.018) (0.033) (0.024) (0.016) (0.028) (0.021) (0.013) (0.025)
0.35  0.583   0.349   0.234   0.478   0.254   0.224   0.366   0.173   0.193   0.257   0.120   0.136
      (0.023) (0.020) (0.034) (0.023) (0.018) (0.032) (0.022) (0.015) (0.028) (0.020) (0.013) (0.024)
0.40  0.544   0.311   0.233   0.427   0.223   0.203   0.315   0.148   0.167   0.228   0.105   0.122
      (0.019) (0.018) (0.029) (0.019) (0.017) (0.027) (0.018) (0.013) (0.024) (0.018) (0.011) (0.022)
0.45  0.494   0.262   0.232   0.372   0.180   0.192   0.264   0.123   0.141   0.190   0.080   0.109
      (0.017) (0.017) (0.024) (0.018) (0.015) (0.025) (0.016) (0.012) (0.020) (0.014) (0.011) (0.020)
0.50  0.428   0.226   0.202   0.320   0.152   0.168   0.227   0.107   0.119   0.147   0.065   0.082
      (0.015) (0.015) (0.023) (0.016) (0.012) (0.021) (0.015) (0.011) (0.020) (0.011) (0.009) (0.015)

Notes: See text for a description of the estimator. Data are from the NLSY. We use multiyear averages of son's income over 1996-2002
and parent income measured over 1978-1980. Standard errors are in parentheses.

Table 2: Upward Mobility Estimates by Race
ν = Prob(F1(Y1) - F0(Y0) > τ | F0(Y0) <= s)

         τ = 0                      τ = 0.1                    τ = 0.2                    τ = 0.3
s        Whites   Blacks   W-B      Whites   Blacks   W-B      Whites   Blacks   W-B      Whites   Blacks   W-B
0.05     0.977    0.950    0.027    0.904    0.635    0.270    0.745    0.420    0.325    0.614    0.312    0.303
        (0.024)  (0.018)  (0.033)  (0.047)  (0.044)  (0.066)  (0.065)  (0.045)  (0.083)  (0.073)  (0.040)  (0.084)
0.10     0.947    0.883    0.065    0.840    0.574    0.266    0.698    0.377    0.321    0.595    0.288    0.307
        (0.022)  (0.022)  (0.032)  (0.035)  (0.032)  (0.051)  (0.047)  (0.031)  (0.059)  (0.053)  (0.025)  (0.061)
0.15     0.909    0.835    0.074    0.786    0.567    0.219    0.629    0.390    0.240    0.519    0.281    0.238
        (0.021)  (0.020)  (0.029)  (0.031)  (0.027)  (0.042)  (0.040)  (0.025)  (0.047)  (0.040)  (0.025)  (0.048)
0.20     0.871    0.796    0.075    0.755    0.556    0.198    0.592    0.387    0.205    0.485    0.285    0.200
        (0.021)  (0.017)  (0.027)  (0.029)  (0.024)  (0.039)  (0.030)  (0.022)  (0.037)  (0.032)  (0.020)  (0.039)
0.25     0.838    0.762    0.076    0.724    0.537    0.187    0.575    0.373    0.202    0.463    0.274    0.188
        (0.021)  (0.019)  (0.030)  (0.024)  (0.024)  (0.038)  (0.028)  (0.024)  (0.036)  (0.028)  (0.019)  (0.034)
0.30     0.821    0.734    0.087    0.715    0.521    0.193    0.568    0.360    0.208    0.447    0.262    0.185
        (0.018)  (0.019)  (0.027)  (0.021)  (0.021)  (0.033)  (0.026)  (0.020)  (0.036)  (0.025)  (0.019)  (0.035)
0.35     0.786    0.717    0.069    0.668    0.514    0.154    0.537    0.360    0.178    0.415    0.263    0.153
        (0.019)  (0.017)  (0.026)  (0.020)  (0.023)  (0.030)  (0.021)  (0.019)  (0.031)  (0.023)  (0.016)  (0.028)
0.40     0.757    0.704    0.052    0.641    0.506    0.135    0.513    0.357    0.156    0.393    0.254    0.139
        (0.018)  (0.016)  (0.025)  (0.017)  (0.020)  (0.028)  (0.020)  (0.019)  (0.027)  (0.019)  (0.018)  (0.027)
0.45     0.731    0.687    0.044    0.605    0.495    0.110    0.484    0.350    0.134    0.367    0.248    0.119
        (0.015)  (0.017)  (0.024)  (0.021)  (0.021)  (0.032)  (0.019)  (0.018)  (0.028)  (0.019)  (0.017)  (0.026)
0.50     0.695    0.668    0.028    0.578    0.481    0.097    0.457    0.342    0.115    0.342    0.242    0.100
        (0.014)  (0.018)  (0.025)  (0.016)  (0.020)  (0.028)  (0.016)  (0.018)  (0.025)  (0.017)  (0.015)  (0.024)

Notes: See text for a description of the estimator. Data are from the NLSY. We use multiyear averages of son's income over 1996-2002
and parent income measured over 1978-1980. Bootstrapped standard errors are in parentheses.
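
The upward mobility measure in the header of Table 2 can be illustrated with a short sketch. This is a minimal illustration rather than the authors' code: it assumes ranks are taken in the pooled sample via the empirical c.d.f., and the function names and the `group_mask` argument are hypothetical.

```python
import numpy as np

def empirical_cdf_ranks(values):
    """Return F_hat(values[i]): the share of observations <= values[i]."""
    values = np.asarray(values, dtype=float)
    sorted_vals = np.sort(values)
    # side="right" counts how many observations are <= each value
    return np.searchsorted(sorted_vals, values, side="right") / values.size

def upward_mobility(parent_income, child_income, group_mask, tau=0.0, s=0.2):
    """Estimate Prob(F1(Y1) - F0(Y0) > tau | F0(Y0) <= s) within one group.

    Assumption: ranks F0, F1 are computed in the pooled sample, and the
    probability is then evaluated among members of `group_mask`.
    """
    r0 = empirical_cdf_ranks(parent_income)   # parent rank F0(Y0)
    r1 = empirical_cdf_ranks(child_income)    # child rank F1(Y1)
    eligible = np.asarray(group_mask, dtype=bool) & (r0 <= s)
    if not eligible.any():
        return float("nan")                   # no one in the conditioning set
    return float(np.mean((r1 - r0)[eligible] > tau))
```

Calling the function once per group and differencing the two results would give the analogue of a W-B column entry; standard errors would still require a resampling step.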


Table 3: Summary statistics of bootstrapped values of ν* - ν̂

For s = 0.25
          mean        median      skewness    kurtosis    p-value     p-value     p-value
                                                          skew test   kurt. test  joint (chi sq)
τ = 0     -0.009      -0.010      0.100       2.787       0.550       0.637       0.746
τ = 0.1   0.000       0.000       -0.091      2.700       0.587       0.417       0.618
τ = 0.2   0.001       0.001       0.118       2.916       0.483       0.982       0.780

For s = 0.5
          mean        median      skewness    kurtosis    p-value     p-value     p-value
                                                          skew test   kurt. test  joint (chi sq)
τ = 0     -0.001      -0.002      0.283       3.301       0.097       0.297       0.143
τ = 0.1   -0.001      -0.002      0.066       2.638       0.695       0.285       0.519
τ = 0.2   0.001       0.001       -0.017      2.543       0.918       0.131       0.314

Notes: In all cases N = 200. p-values are from using "sktest" command in STATA v10.1
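
The moment columns of Table 3 can be reproduced in outline as follows. This is a sketch under stated assumptions, not the authors' procedure: it resamples (parent, child) pairs with replacement, recomputes pooled-sample ranks in each draw, and reports moment-based skewness and (non-excess) kurtosis. The function names are hypothetical, and the p-values produced by Stata's `sktest` are not replicated here.

```python
import numpy as np

def upward_mobility(parent, child, tau, s):
    """Share with F1(Y1) - F0(Y0) > tau among observations with F0(Y0) <= s."""
    n = parent.size
    r0 = np.searchsorted(np.sort(parent), parent, side="right") / n
    r1 = np.searchsorted(np.sort(child), child, side="right") / n
    low = r0 <= s
    return np.mean((r1 - r0)[low] > tau)

def bootstrap_deviations(parent, child, tau, s, n_boot=200, seed=0):
    """Return n_boot draws of nu_star - nu_hat from a pairs bootstrap."""
    rng = np.random.default_rng(seed)
    nu_hat = upward_mobility(parent, child, tau, s)
    n = parent.size
    devs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)      # resample pairs with replacement
        devs[b] = upward_mobility(parent[idx], child[idx], tau, s) - nu_hat
    return devs

def moment_summary(x):
    """Mean, median, skewness and kurtosis (a normal sample gives about 3)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    return {
        "mean": x.mean(),
        "median": np.median(x),
        "skewness": np.mean(centered ** 3) / m2 ** 1.5,
        "kurtosis": np.mean(centered ** 4) / m2 ** 2,
    }
```

The default `n_boot=200` mirrors the N = 200 stated in the notes to Table 3; skewness near 0 and kurtosis near 3 are what approximate normality of the bootstrap distribution would look like.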


Table 4: Upward Mobility Estimates by Race Using Intervals of Parent Income
ν = Prob(F1(Y1) - F0(Y0) > τ | F0(Y0) >= s1 & F0(Y0) <= s2)

               τ = 0                      τ = 0.1                    τ = 0.2                    τ = 0.3
s1 to s2       Whites   Blacks   W-B      Whites   Blacks   W-B      Whites   Blacks   W-B      Whites   Blacks   W-B
0.01 to 0.05   0.977    0.950    0.027    0.904    0.635    0.270    0.745    0.420    0.325    0.614    0.312    0.303
              (0.024)  (0.018)  (0.033)  (0.047)  (0.044)  (0.066)  (0.065)  (0.045)  (0.083)  (0.073)  (0.040)  (0.084)
0.06 to 0.10   0.915    0.813    0.102    0.770    0.511    0.259    0.647    0.332    0.315    0.573    0.263    0.311
              (0.048)  (0.035)  (0.059)  (0.067)  (0.048)  (0.083)  (0.079)  (0.043)  (0.093)  (0.079)  (0.035)  (0.090)
0.11 to 0.15   0.847    0.708    0.138    0.698    0.547    0.151    0.518    0.423    0.095    0.395    0.263    0.132
              (0.047)  (0.051)  (0.070)  (0.062)  (0.053)  (0.083)  (0.075)  (0.051)  (0.093)  (0.068)  (0.050)  (0.089)
0.16 to 0.20   0.780    0.645    0.134    0.679    0.516    0.162    0.501    0.376    0.124    0.404    0.300    0.104
              (0.058)  (0.048)  (0.079)  (0.067)  (0.053)  (0.089)  (0.070)  (0.050)  (0.082)  (0.066)  (0.049)  (0.087)
0.21 to 0.25   0.751    0.473    0.278    0.645    0.376    0.269    0.532    0.256    0.275    0.404    0.186    0.218
              (0.052)  (0.070)  (0.092)  (0.058)  (0.062)  (0.089)  (0.057)  (0.060)  (0.082)  (0.058)  (0.056)  (0.083)
0.26 to 0.30   0.755    0.534    0.221    0.677    0.406    0.271    0.542    0.265    0.277    0.388    0.173    0.215
              (0.049)  (0.059)  (0.077)  (0.061)  (0.061)  (0.088)  (0.065)  (0.051)  (0.083)  (0.062)  (0.050)  (0.077)
0.31 to 0.35   0.639    0.495    0.144    0.471    0.420    0.051    0.408    0.358    0.050    0.282    0.272    0.010
              (0.072)  (0.073)  (0.104)  (0.063)  (0.075)  (0.098)  (0.060)  (0.076)  (0.102)  (0.061)  (0.071)  (0.104)
0.36 to 0.40   0.613    0.489    0.124    0.510    0.371    0.139    0.392    0.313    0.079    0.282    0.110    0.172
              (0.055)  (0.092)  (0.113)  (0.061)  (0.090)  (0.117)  (0.056)  (0.087)  (0.113)  (0.053)  (0.068)  (0.090)
0.41 to 0.45   0.578    0.258    0.320    0.385    0.220    0.165    0.307    0.162    0.145    0.213    0.085    0.128
              (0.060)  (0.096)  (0.116)  (0.071)  (0.090)  (0.111)  (0.063)  (0.088)  (0.094)  (0.047)  (0.057)  (0.072)
0.46 to 0.50   0.450    0.311    0.138    0.393    0.225    0.168    0.275    0.195    0.080    0.166    0.135    0.031
              (0.055)  (0.080)  (0.094)  (0.064)  (0.079)  (0.114)  (0.053)  (0.068)  (0.089)  (0.041)  (0.064)  (0.071)

Notes: See text for a description of the estimator. Data are from the NLSY. We use multiyear averages of son's income over 1996-2002
and parent income measured over 1978-1980. Bootstrapped standard errors are in parentheses.


Figure 1: Transition Probabilities Conditional On Parent Percentile
[Plot not reproduced. Vertical axis: Transition Probability (0 to 1); horizontal axis: Percentiles, s (5 to 50); series: Whites, Blacks.]

Figure 2: Upward Mobility Conditional On Parent Percentile, Pr(F1 > F0 | F0 <= s)
[Plot not reproduced. Vertical axis: Pr(F1 > F0) (0 to 1); horizontal axis: Percentiles, s (5 to 50); series: Whites, Blacks.]

Figure 3: Histogram of ν* - ν̂
[Plot not reproduced. Vertical axis: Frequency (0 to 40).]

Notes for figures 1 to 3: See text for descriptions of the estimators. Data are from the NLSY. We use multiyear
averages of son's income over 1996-2002 and parent income measured over 1978-1980. For upward mobility
estimates, bootstrapped 95% pointwise confidence intervals are shown as bands.

Figure 4: Transition Probabilities vs Upward Mobility, Whites - Blacks, Conditional On Parent Percentile
[Plot not reproduced. Vertical axis: Trans Prob, Upward Mob (-0.1 to 0.4); horizontal axis: Percentiles, s (5 to 50); series: Transition Prob, Upward Mobility.]

Figure 5: CDF of Parent Income Conditional on Being in the Bottom Quintile, Whites vs Blacks
[Plot not reproduced. Vertical axis: CDF (0 to 1); horizontal axis: Parent Income (tick marks at -12265, 2900, 4572, 5805, 7000, 8292); series: Blacks, Whites.]

Figure 6: Probability of Leaving Bottom Quintile Conditional on AFQT, Whites vs Blacks
[Plot not reproduced. Vertical axis: Probability (-0.6 to 1.4); horizontal axis: AFQT; series: Whites, Blacks.]

Notes for figures 4 to 6: Data are from the NLSY. We use multiyear averages of son's income over 1996-2002 and
parent income measured over 1978-1980. Uniform confidence intervals shown as dashed lines. For details of
computation see text.

Figure 7: Transition Probability of Leaving Bottom Quintile Conditional on Ed., Whites vs Blacks
[Plot not reproduced. Vertical axis: Probability (-0.4 to 1.6); horizontal axis: Education (5 to 20); series: Whites, Blacks.]

Figure 8: Comparison of Probit and Non-Parametric Estimates of Transition Probability of Whites Leaving Bottom Quintile Conditional on AFQT
[Plot not reproduced. Vertical axis: Probability (0 to 0.9); horizontal axis: AFQT (tick marks at 1, 11, 21, 31, 41); series: Probit, Non-Parametric.]

Figure 9: Upward Mobility Conditional on AFQT Scores, Whites vs Blacks
[Plot not reproduced. Vertical axis: Probability (0 to 1.6); horizontal axis: AFQT; series: Whites, Blacks.]

Notes for figures 7 to 9: Data are from the NLSY. We use multiyear averages of son's income over 1996-2002 and
parent income measured over 1978-1980. For figures 7 and 9, uniform confidence intervals shown as dashed lines.
For details of computation see text.

Figure A1: Comparisons of Transition Probability with Upward Mobility, Lognormal Model
[Plots not reproduced. Three panels: Panel A, τ = 0.05; Panel B, τ = 0.25; Panel C, τ = 0.5. In each panel, vertical axis: Probability (-0.2475 to 0.99); horizontal axis: ρ (-0.99 to 0.99); series: transition probability, upward mobility, difference.]

Notes: Figures are based on a simulation using a lognormal model. See text for further details.


References
[1] Andrews, D. W. K. (1994): Empirical Process Methods in Econometrics, in Handbook of Econometrics, vol 4, pp. 2247-94, Elsevier.
[2] Becker, Gary S. and Nigel Tomes (1979): "An Equilibrium Theory of the Distribution
of Income and Inter-generational Mobility," Journal of Political Economy, 87,
1153-1189.
[3] Bhattacharya, D (2007): Inference on inequality from household survey data, Journal
of Econometrics, Volume 137, Issue 2, Pages 674-707.
[4] Bickel, P. J., and M. Rosenblatt (1973): On some global measures of the deviations
of density function estimates, Annals of Statistics, pages 1071-1091.
[5] Bierens, H. (1994): Topics in advanced Econometrics, Cambridge University Press,
Cambridge, UK.
[6] Black Sandra and Paul Devereux (2010). "Recent Developments in intergenerational
mobility". Handbook of Labor Economics vol. 4, D. Card and O. Ashenfelter, eds.
Elsevier, Amsterdam.
[7] Bound, J., Brown, C., Duncan G.J., and Rodgers, W.L. (1994), "Evidence on the
Validity of Cross-Sectional and Longitudinal Labor Market Data", Journal of Labor
Economics, 12(3): 345-368.
[8] Cameron, Stephen V. and James J. Heckman (2001), “The Dynamics of Educational
Attainment for Black, Hispanic and White Males,” Journal of Political Economy
109:3, 455-499.
[9] Cascio, Elizabeth and Ethan Lewis, 2006, “Schooling and the Armed Forces Qualifying Test: Evidence from School-Entry Laws,” Journal of Human Resources, 41(2),
294-318.
[10] Chay, Kenneth, Jonathan Guryan and Bhashkar Mazumder, 2009, “Birth Cohort and
the Black-White achievement Gap: The Role of Health Soon After Birth,” working
paper, Federal Reserve Bank of Chicago.


[11] Corcoran, Mary and Terry Adams (1997). “Race, Sex, and the Inter-generational
Transmission of Poverty,” in Consequences of Growing up Poor, G. Duncan and J.
Brooks-Gunn, eds., Russell Sage Foundation (New York), pp. 461-517.
[12] Cullen, J., Mark C. Long, and Randall Reback (2011): Jockeying for Position: Strategic High School Choice Under Texas’ Top Ten Percent Plan, NBER Working Paper
No. 16663.
[13] Datcher, Linda (1981). "Race/Sex Differences in the Effects of Background on
Achievement." in Five Thousand American Families: Patterns of Economic Progress
Vol IX. Hill, Martha S.; Hill, Daniel H., and Morgan, James N., (eds.). Ann Arbor:
Institute for Social Research, University of Michigan; pp359-390.
[14] Dobbie, Will and Roland Fryer (forthcoming). "Are High-Quality Schools Enough
to Increase Achievement Among the Poor? Evidence from the Harlem Children’s
Zone", American Economic Journal: Applied Economics
[15] Formby, J., Smith W. J. & Zheng, B. (2004): Mobility measurement, transition
matrices and statistical inference, Journal of Econometrics, vol. 120, pp. 181-205.
[16] Fryer, Roland (2010). "Racial Inequality in the 21st Century: The Declining Significance of Discrimination" Handbook of Labor Economics vol. 4, D. Card and O.
Ashenfelter, eds. Elsevier, Amsterdam.
[17] Haider, Steven J. and Gary Solon. 2006. “Life Cycle Variation in the Association
Between Current and Lifetime Earnings.” American Economic Review 96(4): 1308-1320.
[18] Hansen, Karsten, James Heckman, and Kathleen Mullen, 2004, “The Effect of Schooling and Ability on Achievement Test Scores,” Journal of Econometrics, 121(1-2),
39-98.
[19] Hardle, W. (1990): Applied nonparametric regression, Cambridge University Press,
Cambridge, UK.
[20] Heckman, James, Stixrud Jora and Sergio Urzua (2006) "The Effects of Cognitive
and Noncognitive Abilities on Labor Market Outcomes and Social Behavior" Journal
of Labor Economics 24(3) 411-482.


[21] Hertz, Tom, (2005), “Rags, Riches and Race: The Intergenerational Economic Mobility of Black and White Families in the United States,” in Unequal Chances: Family
Background and Economic Success. Ed. Samuel Bowles, Herbert Gintis and Melissa
Osborne Groves. Princeton University Press.
[22] Hertz, Tom (2008), "A Group Specific Measure of Intergenerational Persistence",
Economics Letters 100(3): 415-417.
[23] Jencks, Christopher and Laura Tach. 2006. “Would Equal Opportunity Mean More
Mobility?” in Stephen Morgan and David Grusky (eds.) Mobility and Inequality:
Frontiers of Research from Sociology and Economics. CA: Stanford University Press.
[24] Kopczuk, Wojciech, Saez Emmanuel and Jae Song (2010). "Earnings Inequality and
Mobility in the United States: Evidence from Social Security Data Since 1937" Quarterly Journal of Economics, 125(1): 91-128.
[25] Lee, Chul-In, and Gary Solon. 2009. “Trends in Intergenerational Income Mobility.”
forthcoming, Review of Economics and Statistics.
[26] Neal, Derek A. and William R. Johnson (1996): "The Role of Premarket Factors in
Black-White Wage Differences," Journal of Political Economy 104:5, 860-895.
[27] O’Neill, Donal, Sweetman, Olive and Dirk Van de Gaer (2006): "The impacts of
cognitive skills on the distribution of the black-white wage gap." Labour Economics
13:343-356.
[28] O’Neill, Donal, Sweetman, Olive and Dirk Van de Gaer (2007). "The effects of
measurement error and omitted variables when using transition matrices to measure
inter-generational mobility." Journal of Economic Inequality 5:159-178.
[29] Pagan, A. and Aman Ullah (1999): Nonparametric Econometrics, Cambridge University Press.
[30] Powell, J. Stock, J and Stoker, T. (1989): Semiparametric Estimation of index coefficients, Econometrica, vol. 57, pp. 1403-30.
[31] Roemer, John E. (2004): "Equal opportunity and inter-generational mobility: going beyond inter-generational income transition matrices" in Generational Income
Mobility in North America and Europe, Ed. Miles Corak, Cambridge University
Press.
[32] Shorrocks (1978): The measurement of mobility, Econometrica, vol. 46, number 5,
pp. 1013-1024.
[33] Solon, Gary (1992), “Intergenerational Income Mobility in the United States,” American Economic Review 82(3), pp. 393-408.
[34] Solon, Gary (1999), "Intergenerational Mobility in the Labor Market," Handbook of
Labor Economics, Volume 3A, Orley C. Ashenfelter and David Card, eds. Elsevier,
Amsterdam, North Holland.
[35] Swift, Adam (2005): "Justice, Luck and the Family" in Unequal Chances: Family
Background and Economic Success. Ed. Samuel Bowles, Herbert Gintis and Melissa
Osborne Groves. Princeton University Press.
[36] Trede, M (1998): “The Age profile of Mobility Measures: An Application to Earnings
in West Germany,” Journal of Applied Econometrics, 13, 397-409
[37] Van de Gaer, D., E. Schokkaert and M. Martinez (2001), “Three Meanings of Intergenerational Mobility,” Economica, 272, 519-538.
[38] van der Vaart, A and Jon Wellner (1998): Asymptotic Statistics, Cambridge University Press.
[39] van der Vaart, A and Jon Wellner (2000): Weak Convergence and Empirical
Processes, Springer.


Appendix with Proofs
In the statements of the results and in their proofs, $C$ will denote a generic positive constant, not always having the same value, and whenever moments, derivatives or Lebesgue
densities are defined, they are implicitly assumed to exist.
Part A:
Proposition 1: We now state a set of general regularity conditions which will imply
zero-mean asymptotic normality for the Nadaraya-Watson estimated regression of the
unobserved random variable
\[
W := 1\{Y_1 \le \zeta_1,\; Y_0 \le \zeta_0\}
\]
on $X$, evaluated at $X = x$. These conditions are standard (for textbook treatments, see
Bierens (1994), theorem 10.2.1 or Pagan and Ullah (1999) theorem 3.5, 3.6) but we state
them here to make the subsequent proposition and lemma statements self-contained.
Condition NW
NW1. $X$ is a $d$-dimensional continuously distributed random variable with Lebesgue
density $f(\cdot)$ which is positive at $x$. For all our applications, $d = 1$.
NW2. The data $(X_i, Y_{1i}, Y_{0i})$ are i.i.d.
NW3. $K(\cdot)$ is a Borel-measurable, bounded and real-valued kernel function with
$d$-dimensional argument, satisfying (i) $\int K(u)\,du = 1$, $\int u K(u)\,du = 0$, $\int u_j^2 K(u)\,du < \infty$
for $j = 1, \ldots, d$, (ii) $\int |K(u)|\,du < \infty$, (iii) for some $\delta > 0$, $\int |K(u)|^{2+\delta}\,du < \infty$.
NW4. The bandwidth sequence $h_n$ satisfies $\lim_{n\to\infty} h_n = 0$, $\lim_{n\to\infty} n h_n^d = \infty$ and
$\lim_{n\to\infty} (n h_n^d)^{1/2} h_n^2 = 0$; $h_n^{-d} K\left(\frac{X - x}{h_n}\right)$ is uniformly bounded for $x, X \in \mathrm{support}(X)$.
NW5. The functions $f(\cdot)$ and $f(\cdot) \times \phi(\cdot, \zeta_0, \zeta_1)$ and their derivatives up to order 2
are continuous and uniformly bounded.
Then we have
\[
(n h_n^d)^{1/2} \left( \hat{\phi}(x, \hat{\zeta}_0, \hat{\zeta}_1) - \phi(x, \zeta_0^0, \zeta_1^0) \right) \xrightarrow{d} N\left(0,\; \frac{\sigma^2(x)}{f(x)} \int K^2(u)\,du \right),
\]
where $\sigma^2(x) = \phi(x, \zeta_0^0, \zeta_1^0) \times \left(1 - \phi(x, \zeta_0^0, \zeta_1^0)\right)$.
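
The object in Proposition 1 can be illustrated numerically. The sketch below is a minimal illustration, not the authors' implementation: it assumes a scalar covariate, a Gaussian kernel (for which the integral of the squared kernel is 1/(2√π)), known cutoffs `zeta0` and `zeta1`, and that the regression function is the conditional mean of the indicator. All function and argument names are hypothetical.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def nw_indicator_regression(x0, X, Y0, Y1, zeta0, zeta1, h):
    """Nadaraya-Watson regression of 1{Y1 <= zeta1, Y0 <= zeta0} on X at x0.

    Returns the point estimate and a plug-in standard error based on the
    limiting variance sigma^2(x) / f(x) * int K^2, divided by n * h.
    """
    X = np.asarray(X, dtype=float)
    w = gaussian_kernel((X - x0) / h)
    indicator = ((np.asarray(Y1) <= zeta1) & (np.asarray(Y0) <= zeta0)).astype(float)
    f_hat = w.sum() / (X.size * h)                 # kernel density estimate at x0
    phi_hat = np.sum(w * indicator) / w.sum()      # kernel-weighted mean of the indicator
    int_k_sq = 1.0 / (2.0 * np.sqrt(np.pi))        # integral of K^2 for the Gaussian kernel
    variance = phi_hat * (1.0 - phi_hat) * int_k_sq / (f_hat * X.size * h)
    return float(phi_hat), float(np.sqrt(variance))
```

In the proposition the cutoffs are themselves estimated; the result says that, under the NW conditions, plugging in estimated cutoffs does not change this limiting variance.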
Proof outline for proposition 1:

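Proof. First note that the function $\phi(x, \cdot, \cdot)$ is Lipschitz with respect to the Euclidean
norm $\|\cdot\|$:
\[
|\phi(x, \zeta_0, \zeta_1) - \phi(x, \zeta_0', \zeta_1')| \le \|(\zeta_0, \zeta_1) - (\zeta_0', \zeta_1')\| \, M(x) \tag{11}
\]
with $M(x)$ uniformly bounded on the support of $X$. To see this, note that $\nabla \phi(x, \cdot, \cdot)$ is the gradient of the conditional c.d.f. $F_{Y_0 Y_1 \mid X}(\cdot \mid x)$, so that applying the mean-value theorem to the LHS of (11) in conjunction
with NW5 will yield the result.
Now consider the expression
\[
\bar{Q}_n(x, \zeta) = \frac{1}{n h_n^d} \sum_{i=1}^n K\left(\frac{X_i - x}{h_n}\right) 1(Y_{1i} \le \zeta_1,\; Y_{0i} \le \zeta_0), \qquad \zeta = (\zeta_0, \zeta_1),
\]
whose expectation is given by
\[
\begin{aligned}
\bar{Q}_n^*(x, \zeta) &= E\left[ \frac{1}{h_n^d} K\left(\frac{X - x}{h_n}\right) \phi(X, \zeta_0, \zeta_1) \right] \\
&= \int K(u)\, f(x + h_n u)\, \phi(x + h_n u, \zeta_0, \zeta_1)\,du \\
&= f(x)\, \phi(x, \zeta_0, \zeta_1) + O(h_n^2).
\end{aligned} \tag{12}
\]
So
\[
\begin{aligned}
\bar{Q}_n^*(x, \hat{\zeta}) &= f(x)\, \phi(x, \hat{\zeta}_0, \hat{\zeta}_1) + O(h_n^2) \\
&= f(x) \left[ \phi(x, \zeta_0^0, \zeta_1^0) + \phi_0(x, \tilde{\zeta}_0, \tilde{\zeta}_1)(\hat{\zeta}_0 - \zeta_0^0) + \phi_1(x, \tilde{\zeta}_0, \tilde{\zeta}_1)(\hat{\zeta}_1 - \zeta_1^0) \right] + O(h_n^2),
\end{aligned}
\]
where $\tilde{\zeta}_1$ denotes a value intermediate between $\hat{\zeta}_1$ and $\zeta_1^0$ and similarly $\tilde{\zeta}_0$. Now,
\[
\begin{aligned}
&\hat{\phi}(x, \hat{\zeta}_0, \hat{\zeta}_1) - \phi(x, \zeta_0^0, \zeta_1^0) \\
&= \frac{\frac{1}{n h_n^d} \sum_{i=1}^n K\left(\frac{X_i - x}{h_n}\right) 1(Y_{1i} \le \hat{\zeta}_1,\; Y_{0i} \le \hat{\zeta}_0)}{\hat{f}(x)} - \phi(x, \zeta_0^0, \zeta_1^0) \\
&= \frac{\frac{1}{n h_n^d} \sum_{i=1}^n K\left(\frac{X_i - x}{h_n}\right) \left\{ 1(Y_{1i} \le \hat{\zeta}_1,\; Y_{0i} \le \hat{\zeta}_0) - 1(Y_{1i} \le \zeta_1^0,\; Y_{0i} \le \zeta_0^0) \right\}}{\hat{f}(x)} \\
&\quad + \underbrace{\left\{ \frac{\frac{1}{n h_n^d} \sum_{i=1}^n K\left(\frac{X_i - x}{h_n}\right) 1(Y_{1i} \le \zeta_1^0,\; Y_{0i} \le \zeta_0^0)}{\hat{f}(x)} - \phi(x, \zeta_0^0, \zeta_1^0) \right\}}_{= S_n, \text{ say}} \\
&= \underbrace{\frac{\left[ \bar{Q}_n(x, \hat{\zeta}) - \bar{Q}_n^*(x, \hat{\zeta}) \right] - \left[ \bar{Q}_n(x, \zeta^0) - \bar{Q}_n^*(x, \zeta^0) \right]}{\hat{f}(x)}}_{T_{1n}}
+ \underbrace{\frac{\bar{Q}_n^*(x, \hat{\zeta}) - \bar{Q}_n^*(x, \zeta^0)}{\hat{f}(x)}}_{T_{2n}} + S_n.
\end{aligned} \tag{13}
\]
Now $(n h_n^d)^{1/2} S_n$, under the assumptions NW, will be $O_p(1)$ and zero-mean normal (c.f.,
Bierens (1994), theorem 10.2.1), viz.,
\[
(n h_n^d)^{1/2} S_n \xrightarrow{d} N\left(0,\; \frac{\sigma^2(x)}{f(x)} \int K^2(u)\,du \right). \tag{14}
\]
Next, using (12) and the fact that $(\hat{\zeta}_0 - \zeta_0^0)$ and $(\hat{\zeta}_1 - \zeta_1^0)$ have parametric convergence rates, we get that
\[
\begin{aligned}
(n h_n^d)^{1/2}\, T_{2n} &= \frac{f(x)}{\hat{f}(x)} \Big[ \phi_0(x, \tilde{\zeta}_0, \tilde{\zeta}_1) \times (h_n^d)^{1/2} \times \underbrace{n^{1/2} (\hat{\zeta}_0 - \zeta_0^0)}_{O_p(1)} \\
&\qquad\qquad + \phi_1(x, \tilde{\zeta}_0, \tilde{\zeta}_1) \times (h_n^d)^{1/2} \times \underbrace{n^{1/2} (\hat{\zeta}_1 - \zeta_1^0)}_{O_p(1)} \Big]
+ \frac{1}{\hat{f}(x)} \underbrace{O\left( (n h_n^d)^{1/2} \times h_n^2 \right)}_{o(1), \text{ by NW4}} \\
&= o_p(1).
\end{aligned} \tag{15}
\]
The nonstandard term in (13) is $T_{1n}$ and we now demonstrate a stochastic equicontinuity
property for it. Letting
\[
Z_n(x, \zeta) = (n h_n^d)^{1/2} \left\{ \bar{Q}_n(x, \zeta) - \bar{Q}_n^*(x, \zeta) \right\},
\]
the first term satisfies
\[
(n h_n^d)^{1/2}\, T_{1n} = \frac{Z_n(x, \hat{\zeta}) - Z_n(x, \zeta^0)}{\hat{f}(x)}, \tag{16}
\]
and we now show that the numerator of (16) is $o_p(1)$, for each $x$. Now, for fixed $x$, the
class of functions
\[
m(\cdot\,; \zeta) := \frac{1}{h_n^d} K\left(\frac{X - x}{h_n}\right) 1(Y_1 \le \zeta_1,\; Y_0 \le \zeta_0)
\]
forms a type IV class (c.f. Andrews (1994), equation 5.3) with $p = 2$. This follows from
the Lipschitz property (11) and the uniform boundedness of $h_n^{-d} K\left(\frac{X - x}{h_n}\right)$. This, in turn,
implies that the sequence $Z_n(x, \cdot)$ is stochastically equicontinuous. Now, using the same
steps as Andrews (1994) leading to his equation (3.8), we conclude that
\[
Z_n(x, \hat{\zeta}) - Z_n(x, \zeta^0) = o_p(1).
\]
Put together (13), (14), (15) and (16) together with $\hat{f}(x) = O_p(1)$ to conclude.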
Part B:
Hadamard differentiability of upward mobility: Let $f(y_0, y_1)$ and $f^{(0)}(y_0, y_1)$
denote respectively the joint density of $(Y_0, Y_1)$ and its derivative w.r.t. the first argument,
evaluated at the point $(y_0, y_1)$. Let $f_1(y_1)$ denote the marginal density of $Y_1$ and let
$C$ denote a generic positive constant. Since all our measures are robust to monotone
transformations of the income variables, we will continue to assume that the support of the
income variables is contained in $[1, \infty)$.
Condition : (Ai) for some   1, we have 1 () ≥  for  large enough which
¢
¡
1
also implies that 1−1 ()   (1 − ) 1− , (Aii)  (0) 0  1−1 (0 (0 ) +  ) ≤  0 for some
(1+−0 )(−1)


0

23

and (Aiv)
0  0 , (Aiii) for some   0, 1 − 0 (0 )  0
Z ∞
¡
¢

(1 − 0 (0 )) −1  0  1−1 (0 (0 ) +  ) 0  ∞
1

It is interesting to note that if the tails of $(Y_0, Y_1)$ have a joint Pareto distribution,
then all of these conditions are automatically satisfied. To see this, assume that $(Y_0, Y_1)$
satisfy
\[
\Pr(Y_0 \ge y_0,\; Y_1 \ge y_1) = \frac{1}{\left(1 + (y_0 - 1) + (y_1 - 1)\right)^{a}}
\]
for all $y_0, y_1 \ge 1$ for some $a > 0$. Then their joint density is given by
\[
f(y_0, y_1) = \frac{a(a + 1)}{\left(1 + (y_0 - 1) + (y_1 - 1)\right)^{a+2}}.
\]
Then one may verify that conditions A(i)-A(iv) are satisfied with $\delta = a + 1$, $\delta_0 = a + 2$
and $\varepsilon = 1 + a + a(a + 1)$.
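
The density in the Pareto example follows from differentiating the joint survival function once in each argument. The worked steps below are a sketch in which the tail index is written as $a$ and the labels $S$, $f$ and $f_1$ are chosen here for illustration; the last line shows why the marginal density of $Y_1$ decays like a power with exponent $a + 1$.

```latex
% Assumed notation: a > 0 is the Pareto tail index; S, f and f_1 are labels chosen for this sketch.
\begin{align*}
S(y_0, y_1) &:= \Pr(Y_0 \ge y_0,\, Y_1 \ge y_1) = \bigl(1 + (y_0 - 1) + (y_1 - 1)\bigr)^{-a}, \\
\frac{\partial S}{\partial y_1} &= -a \bigl(1 + (y_0 - 1) + (y_1 - 1)\bigr)^{-(a+1)}, \\
f(y_0, y_1) &= \frac{\partial^2 S}{\partial y_0\, \partial y_1}
             = a(a+1) \bigl(1 + (y_0 - 1) + (y_1 - 1)\bigr)^{-(a+2)}, \\
\Pr(Y_1 \ge y_1) &= S(1, y_1) = y_1^{-a}, \qquad f_1(y_1) = a\, y_1^{-(a+1)}.
\end{align*}
```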
An exactly symmetric set of conditions are assumed to hold for the marginal density
0 () of 0 as well.
(Bi) for some   1, we have 0 () ≥  for  large enough which also implies that
¡
¢
1
0−1 ()   (1 − ) 1− , (Bii)  (0) 0−1 ()  1 ≤ 0 for some  0  0 , (Biii) for some
(1+− 0 )(−1)


1

and
  0, we have 1 − 1 (1 )  1
Z ∞
¡
¢

(1 − 1 (1 )) −1  0−1 ()  1 1  ∞.
(Biv)
1

23 Condition A(iv) is like a moment condition; recall that for a positive random variable $Y$ with
marginal c.d.f. $F(y)$ and support $S$, the quantity $\int_S (1 - F(y))\,dy$ equals $E(Y)$.

To show that the map $F \mapsto \nu(F)$ is Hadamard-differentiable, let $\bar{D}[1, \infty)$ denote
the space of bivariate c.d.f.'s on $[1, \infty)$, satisfying conditions (Ai)-(Aiv) and (Bi)-(Biv).
Denote by $D_0$ the space of sample paths corresponding to the composite Brownian bridge
$[\mathbb{G} \circ F]$ where $\mathbb{G}$ is a standard Brownian bridge and $F$ is any c.d.f. in $\bar{D}[1, \infty)$. Let
$D = \bar{D}[1, \infty) \cup D_0$, equipped with the supremum norm. We want to show Hadamard
differentiability of the map $F \mapsto \nu(F)$ as a map from the normed vector space $D$ to $\mathbb{R}$.

Consider perturbations $F_t(y_0, y_1) = F(y_0, y_1) + t H_t(y_0, y_1) \in \bar{D}[1, \infty)$ with $H_t \to H \in D_0$,
uniformly as $t \to 0$. We want to show that
\[
\left| \frac{\nu(F_t) - \nu(F)}{t} - \nu_F'(H) \right| \to 0 \quad \text{as } t \to 0
\]
for a linear functional $\nu_F'(H)$ which is a map from $\bar{D}[1, \infty)$ to $\mathbb{R}$.

Lemma 1 Under conditions (Ai)-(Aiv) and (Bi)-(Biv), the map $F \mapsto \nu(F)$ from $D \to \mathbb{R}$,
defined as
\[
\nu(F) = \int_1^{F_0^{-1}(s)} \int_1^{F_1^{-1}(F_0(y_0) + \tau)} f(y_0, y_1)\,dy_1\,dy_0
\]
for any fixed $s, \tau \in (0, 1)$ is Hadamard differentiable at $F$ tangentially to $D_0$. The
derivative at $F$ in the direction $H$ is given by the linear functional $\nu_F'(H)$ defined as
\[
\begin{aligned}
\nu_F'(H) &= \frac{H_0\left(F_0^{-1}(s)\right)}{f_0\left(F_0^{-1}(s)\right)} \int_1^{F_1^{-1}(s + \tau)} f\left(F_0^{-1}(s), y_1\right) dy_1 \\
&\quad + \int_1^{F_0^{-1}(s)} \frac{H_0(y_0) - H_1\left(F_1^{-1}(F_0(y_0) + \tau)\right)}{f_1\left(F_1^{-1}(F_0(y_0) + \tau)\right)}\, f\left(y_0, F_1^{-1}(F_0(y_0) + \tau)\right) dy_0 \\
&\quad + \int_1^{F_0^{-1}(s)} \int_1^{F_1^{-1}(F_0(y_0) + \tau)} dH(y_0, y_1),
\end{aligned}
\]
where
\[
H_0(y) = \lim_{y_1 \to \infty} H(y, y_1) \quad \text{and} \quad H_1(y) = \lim_{y_0 \to \infty} H(y_0, y). \tag{17}
\]

Proof. Consider perturbations  (0  1 ) =  (0  1 ) +  (0  1 ) with 0 (0 ) =
0 (0 )+0 (0 ) and 1 (0 ) = 1 (1 )+1 (1 ) denoting the corresponding marginals.
Let  →  uniformly as  → 0 and let 0 and 1 denote its marginals. We want to

show that for a linear functional 0 (),
¯
¯
¯  ( ) −  ( )
¯
0
¯
¯ → 0 as  → 0.
−

()

¯
¯



(18)

Define
1 (0 ) = 1−1 (0 (0 ) +  ) , 1 (0 ) = 1−1 (0 (0 ) +  )
0 = 0−1 () , 0 = 0−1 () .
So we need to show
¯ R  R  ( )
¯
¯ 0 1 0  (   ) − R 0 R 1 (0 )  (   )  
¯

0 1
0 1
1
0
¯ 1 1
¯
0
1
1
−  ()¯ → 0 as  → 0.
¯
¯
¯

Note that the first term inside || can be expanded as

Z 1 (0 )
0 − 0
1 (0 ) − 1 (0 )
 (̄1 (0 )  0 ) 0 +
×
 (̄0  1 ) 1


1
1
Z 0 Z 1 (0 )
+
 (0  1 )
1
1
⎡ Z 0
⎤
1 (0 ) − 1 (0 )
[ (0  ̄1 (0 )) −  (0  1 (0 ))] 0 ⎥
⎢

⎢ |1
{z
} ⎥
⎢
⎥
⎢
⎥
Z 1 (0 ) 10
= ⎢
⎥
0 − 0
⎢
⎥
⎢
⎥
×
+
[ (̄0  1 ) −  (0  1 )] 1
⎣
⎦

1
|
{z
}
11
{z
}
|
Z

0

1

Z 1 (0 )
Z 0
0 − 0
1 (0 ) − 1 (0 )
 (0  1 ) 1 +
×
+
 (0  1 ) 1


1
1
|
{z
} |
{z
}
2

Z 0 Z 1 (0 )
Z 0 Z 1 (0 )
+
 (0  1 ) −
 (0  1 )
1
1
1
1
{z
}
|

3

4

Z 0 Z 1 (0 )
Z 0 Z 1 (0 )
Z 0 Z 1 (0 )
+
 (0  1 ) −
 (0  1 ) +
 (0  1 ).
1
1
1
1
1
1
{z
} |
{z
}
|
5

6

We will show that as  → 0,
Step 1: |1 | → 0

Step 2:

0 (0 )
2 →
0 (0 )

Z

1 (0 )

1


 (0  1 ) 1

Step 3:
3 →

Z

1

0

0 (0 ) − 1 (1 (0 ))
 (0  1 (0 )) 0
1 (1 (0 ))

Step 4: |4 | → 0, |5 | → 0.

Then we will have shown (18) with
¢ Z  −1 ( ( )+ )
¡
0 0
1
¡
¢
0 0−1 ()
0
¡ −1 ¢
  () =
 0−1 ()  1 1
0 0 () 1
¢
¡
Z 0−1 ()
¢
0 (0 ) − 1 1−1 (0 (0 ) +  ) ¡
¢
¡ −1
+
 0  1−1 (0 (0 ) +  ) 0
1 1 (0 (0 ) +  )
1
Z 0−1 () Z 1−1 (0 (0 )+ )
+
 (0  1 )
1

1

which is linear in .
For steps 1 and 2, we will need the following derivation.
1 (1 (0 )) + 1 (1 (0 )) = 1 (1 (0 )) = 0 (0 ) + 
= 0 (0 ) +  + 0 (0 ) = 1 (1 (0 )) + 0 (0 )
implying that
1 (1 (0 )) − 0 (0 ) = 1 (1 (0 )) − 1 (1 (0 ))
= [1 (0 ) − 1 (0 )] 1 (̃1 (0 ))
where for any 0 and , ̃ (0 ) lies in between  (0 ) and  (0 ). Therefore,
1 (0 ) − 1 (0 ) 0 (0 ) − 1 ( (0 ))
=
.

1 (̃1 (0 ))

(19)

Similarly, 0 (0 ) =  = 0 (0 ) = 0 (0 ) + 0 (0 ), whence
0 (0 )
0 − 0
=
.

0 (̃0 )
Below,  will denote a generic constant, not always of the same value.
Step 1: By a mean-value theorem argument,
¯
Z 0 ¯
¯
¯ 1 (0 ) − 1 (0 )
¯ 0
¯
|10 | ≤
[
(

̄
(
))
−

(


(
))]
0
1
0
0
1
0
¯
¯

1
¯
¯
Z 0 ¯
2
¯
¯ [ (0 ) −  (0 )] (1)
¯
 (0  ̆1 (0 ))¯ 0
≤
¯
¯
¯

1


(20)

where  (1) ( ) denotes derivative w.r.t. the second argument and ̆1 (0 ) lies in between
 (0 ) and  (0 ). Using (19), we get
¯
Z ∞ ¯¯
2
¯
¯ [0 (0 ) − 1 (1 (0 ) =)] (1)
¯

(

̆
(
))
|10 | ≤ 
¯
0 1
0 ¯ 0 .
2
¯
1 (̃1 (0 ))
1 ¯

We will show that (i) [0 (0 ) − 1 (1 (0 ))]2 is uniformly bounded, (ii) 12 (̃1 (0 )) ≥
2


≥  (1 − 0 (0 )) 1− for 0 large enough and
̃1 (0 )2
≤  0 for some 0  1. Then we will have
0

|10 | ≤ 

Z

1

by A(iii).

∞

 small enough and (iii)  (1) (0  ̆1 (0 ))

¯
¯
Z ∞
¯
¯
1
1
¯
¯
0 → 0,
¯ 
2 ¯ 0 ≤ 
1+
¯  0 (1 − 0 (0 )) 1− ¯

1
0
0

To see (i), note that {[0 (0 ) − 1 (1 (0 ))] − [0 (0 ) − 1 (1 (0 ))]} converges

uniformly to 0 and 0 () and 0 () are uniformly bounded.
Next,
1 (0 )
(1)

1

1

= 1−1 (0 (0 ) +  )   (1 − 0 (0 ) −  ) 1− =  (1 − 0 (0 ) − 0 (0 ) −  ) 1−
1

≤ 0 (1 − 0 (0 ) −  ) 1−
1

≤  (1 − 0 (0 )) 1−

(21)

for small enough , since   1 and 0 () converges uniformly to 0. Inequality (1) comes
from condition Ai. Similarly,
1

1 (0 ) ≤  (1 − 0 (0 )) 1−

(22)

and therefore (ii) follows. Finally (iii) follows from (21), (22) and condition Aii.
Next, for |11 |, we have that
¯
¯
Z 1 (0 )
¯
¯ − 
¯
¯ 0
0
×
[ (̄0  1 ) −  (0  1 )] 1 ¯
|11 | ≤ ¯
¯
¯ 
1
¯
¯
¯
¯ [ −  ]2 Z 1 (0 )
¯
¯ 0
0
(0)
×
 (̄0  1 ) 1 ¯
≤ ¯
¯
¯

1
¯
¯∙
¸
Z ∞
¯ (2) Z 1 (0 ) 1
¯  ( ) 2 Z 1 (0 )
1
¯
¯ 0 0
(0)
×
 (̄0  1 ) 1 ¯ ≤ 
1 ≤ 
1
≤ ¯
1+
1+
¯
¯ 0 (̃0 )
0
0
1
1
1


for  small enough and some   0. Inequality (2) follows from conditions (Bi), (Bii),
(Biii) using arguments analogous to those for 10 . This implies that |11 | → 0.

Step 2:
¯
¯
Z
¯
¯
0 (0 ) 1 (0 )
¯
¯
 (0  1 ) 1 ¯
¯2 −
¯
¯
0 (0 ) 1
¯
¯Z
Z
¯
¯ 0  ( ) −  ( )
0 (0 ) 1 (0 )
¯
¯
1
0
1
0
 (0  1 ) 0 −
 (0  1 ) 1 ¯
= ¯
¯
¯ 1

0 (0 ) 1
¯
¯Z
Z
¯
¯ 0  ( )
0 (0 ) 1 (0 )
¯
¯
0
0
 (0  1 ) 1 −
 (0  1 ) 1 ¯
= ¯
¯
¯ 1
0 (̃0 )
0 (0 ) 1
¯
Z 0 ¯
¯ 0 (0 ) 0 (0 ) ¯
¯
¯
≤
¯ 0 (̃0 ) − 0 (0 ) ¯  (0  1 ) 1
1
Z 0
(1)

≤
|0 (0 ) − 0 (0 )| (1 − 1 (1 )) −1  (0  1 ) 1
µ 1
¶Z ∞

≤  sup |0 () − 0 ()| + |0 (0 ) − 0 (0 )|
(1 − 1 (1 )) −1  (0  1 ) 1


1

→ 0

as  → 0, by B(iv). Inequality (1) is a consequence of B(i)-B(iii).

Step 3:
¯
¯
Z 0
¯
¯

(
)
−

(
(
))
0
0
1
1
0
¯3 −
¯

(


(
))

0
1
0
0
¯
¯
1 (1 (0 ))
1
¯
¯
R
0 1 (0 )−1 (0 )
¯
¯
 (1 (0 )  0 ) 0
¯
¯

1
= ¯ R 0 0 (0 )−
¯
(
(
))
1
1
0
¯
¯ −

(


(
))

0
1
0
0
1 (1 (0 ))
1
¯
Z 0 ¯
¯ 1 (0 ) − 1 (0 ) 0 (0 ) − 1 (1 (0 )) ¯
¯
¯  (0  1 (0 )) 0
−
≤
¯
¯

1 (1 (0 ))
1
¯
Z ∞¯
¯ 0 (0 ) − 1 (1 (0 )) [0 (0 ) − 1 (1 (0 ))] ¯
¯
¯  (0  1 (0 )) 0
−
=
¯
¯
1 (̃1 (0 ))
1 (1 (0 ))
1
(
)
Z ∞
(0)
|0 (0 ) − 1 (1 (0 )) − [0 (0 ) − 1 (1 (0 ))]|
0
≤

× (1 − 0 (0 )) −1  (0  1 (0 ))
1
)
(
sup0 |0 (0 ) − 1 ( (0 )) − [0 (0 ) − 1 ( (0 ))]|
≤ 
R∞

× 1 (1 − 0 (0 )) −1  (0  1 (0 )) 0
R∞

which goes to zero if 1 (1 − 0 (0 )) −1  ( (0 )  0 ) 0  ∞, which is condition (Aiv).
(0)

Note that the inequality ≤ follows from step (ii) in the proof of Step 1, above. Finally,
R
1 (1 (0 ))
since 1 0 0 (01)−
 (0  1 (0 )) 0 is continuous in 0 , the conclusion follows.
(1 (0 ))


Step 4:
$T_4 \to 0$ since $H_t \to H$ uniformly and $T_5$ goes to zero by the continuous mapping
theorem since paths of an $F$-Brownian bridge are everywhere continuous with probability
1.
Functional delta method and proposition 3: Since $\sqrt{n}\,(\hat{F} - F) \rightsquigarrow \mathbb{G}$, we have
from lemma 1 and the functional delta method that
\[
\sqrt{n}\,(\hat{\nu} - \nu_0) \xrightarrow{d} \nu_F'(\mathbb{G}),
\]
whence $\nu_F'(\mathbb{G})$ will be distributed as a univariate zero mean normal given by
\[
\begin{aligned}
\nu_F'(\mathbb{G}) &= \frac{\mathbb{G}_0\left(F_0^{-1}(s)\right)}{f_0\left(F_0^{-1}(s)\right)} \int_1^{F_1^{-1}(s + \tau)} f\left(F_0^{-1}(s), y_1\right) dy_1 \\
&\quad + \int_1^{F_0^{-1}(s)} \frac{\mathbb{G}_0(y_0) - \mathbb{G}_1\left(F_1^{-1}(F_0(y_0) + \tau)\right)}{f_1\left(F_1^{-1}(F_0(y_0) + \tau)\right)}\, f\left(y_0, F_1^{-1}(F_0(y_0) + \tau)\right) dy_0 \\
&\quad + \int_1^{F_0^{-1}(s)} \int_1^{F_1^{-1}(F_0(y_0) + \tau)} d\mathbb{G}(y_0, y_1),
\end{aligned}
\]
where $\mathbb{G}_0$ and $\mathbb{G}_1$ are stochastic processes defined from $\mathbb{G}$, analogous to (17), e.g., $\mathbb{G}_0(y)$ is
a univariate normal with mean zero and variance $F_0(y) \times [1 - F_0(y)]$. Now we can apply
the functional delta method argument to justify consistency of the bootstrap via van der
Vaart and Wellner (1996), theorem 3.9.11.
Part C:
Proposition 3: Recall that
\[
\nu(\tau, s; x) = \Pr\left(F_1(Y_1) - F_0(Y_0) > \tau \mid F_0(Y_0) \le s,\; X = x\right)
= \frac{\Pr\left(F_1(Y_1) - F_0(Y_0) > \tau,\; F_0(Y_0) \le s \mid X = x\right)}{\Pr\left(F_0(Y_0) \le s \mid X = x\right)}
:= \frac{\theta(F_0, F_1, x)}{\gamma(F_0, x)}
\]
is estimated by
\[
\hat{\nu}(\tau, s; x)
= \frac{\frac{1}{n h_n^d} \sum_{i=1}^n K\left(\frac{X_i - x}{h_n}\right) 1\left(\hat{F}_1(Y_{1i}) - \hat{F}_0(Y_{0i}) > \tau,\; \hat{F}_0(Y_{0i}) \le s\right)}
       {\frac{1}{n h_n^d} \sum_{i=1}^n K\left(\frac{X_i - x}{h_n}\right) 1\left(\hat{F}_0(Y_{0i}) \le s\right)}
:= \frac{\hat{\theta}(\hat{F}_0, \hat{F}_1, x)}{\hat{\gamma}(\hat{F}_0, x)},
\]
where for $j = 0, 1$, $\hat{F}_j(Y_{ji}) = \frac{1}{n-1} \sum_{k \ne i} 1(Y_{jk} \le Y_{ji})$ and $K(\cdot)$ is a $d$-dimensional kernel
function with a bandwidth sequence $h_n$, satisfying the NW conditions specified in the
appendix. The object whose distribution is needed is given by:
\[
\hat{\nu}(\tau, s; x) - \nu(\tau, s; x) = \frac{\hat{\theta}(\hat{F}_0, \hat{F}_1, x)}{\hat{\gamma}(\hat{F}_0, x)} - \frac{\theta(F_0, F_1, x)}{\gamma(F_0, x)}.
\]
Let

ˆ ()

̂ (0  1  )

̂ (0  )



¶
µ

 − 
1 X

=
  =1

´
³
P

 −
1
1 (1 (1 ) − 0 (0 )    0 (0 ) ≤ )
=1 


=
,
ˆ ()
´
³
P
 −
1
1 (0 (0 ) ≤ )
=1 


=
,
ˆ ()
: = 1 (1 (1 ) − 0 (0 )    0 (0 ) ≤ ) ,
= 1 (0 (0 ) ≤ ) .

Then
 ()

³
´
= ̂ ̂0  ̂1   −  (0  1  )
n
o n ³
´
o
= ̂ (0  1  ) −  (0  1  ) + ̂ ̂0  ̂1   − ̂ (0  1  ) .
{z
} |
{z
}
|
:

̃1 ()

̃2 ()

We will show that $\tilde T_2(x)$ is of smaller order of magnitude than $\tilde T_1(x)$. This will imply that asymptotically, the distribution of $T$ will be that of $\tilde T_1$, which is simply the Nadaraya-Watson regression of the unobserved random variable $W$ on $X$, evaluated at $X=x$, denoted by $\hat E(W \mid X=x)$. The formal result is stated below and its proof appears in the appendix under theorem 2. An exactly analogous result holds for the denominator $\hat\psi(\hat F_0,x)$.

The following additional assumption is used.

Assumption NW5’. The functions $f(x)$, $E(W \mid X=x)$ and $f(x)\times E(W \mid X=x)$ are twice differentiable and the functions and their derivatives up to order 2 are continuous and uniformly bounded. $X$ has bounded support with density bounded away from zero.


Claim: Suppose the data $(X_i, Y_{1i}, Y_{0i})$ for $i=1,\dots,n$ are i.i.d. and assumptions NW1-4, NW5’ hold. Then, we have that
$$
\left(n\sigma_n^d\right)^{1/2}\tilde T_2(x) = o_p(1).
$$

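Proof of claim and thus proposition 3: Let $T_2(x) = \tilde T_2(x)\,\hat f(x)$. We will now show that $E\left(\sqrt{n\sigma_n^d}\,|T_2(x)|\right) \to 0$, which will imply that $|T_2(x)| = o_p\left((n\sigma_n^d)^{-1/2}\right)$ and thus establish the result since $\hat f(x)$ is bounded in probability by assumption.
First observe that, with $A^c$ denoting the complement of an event $A$,
$$
\begin{aligned}
&E\left[\left|1\!\left(\hat F_1(Y_{1i})-\hat F_0(Y_{0i})>\tau,\ \hat F_0(Y_{0i})\le s\right) - 1\!\left(F_1(Y_{1i})-F_0(Y_{0i})>\tau,\ F_0(Y_{0i})\le s\right)\right|\right] \\
&= \Pr\left[\left|1\!\left(\hat F_1(Y_{1i})-\hat F_0(Y_{0i})>\tau,\ \hat F_0(Y_{0i})\le s\right) - 1\!\left(F_1(Y_{1i})-F_0(Y_{0i})>\tau,\ F_0(Y_{0i})\le s\right)\right| \neq 0\right] \\
&= \Pr\left[\left\{\hat F_1(Y_{1i})-\hat F_0(Y_{0i})>\tau,\ \hat F_0(Y_{0i})\le s\right\} \cap \left(F_1(Y_{1i})-F_0(Y_{0i})>\tau,\ F_0(Y_{0i})\le s\right)^c\right] \\
&\quad + \Pr\left[\left\{\hat F_1(Y_{1i})-\hat F_0(Y_{0i})>\tau,\ \hat F_0(Y_{0i})\le s\right\}^c \cap \left(F_1(Y_{1i})-F_0(Y_{0i})>\tau,\ F_0(Y_{0i})\le s\right)\right] \\
&\le \Pr\left[\left\{\hat F_1(Y_{1i})-\hat F_0(Y_{0i})>\tau,\ \hat F_0(Y_{0i})\le s\right\} \cap \left(F_1(Y_{1i})-F_0(Y_{0i})\le\tau\right)\right] \\
&\quad + \Pr\left[\left\{\hat F_1(Y_{1i})-\hat F_0(Y_{0i})>\tau,\ \hat F_0(Y_{0i})\le s\right\} \cap \left(F_0(Y_{0i})>s\right)\right] \\
&\quad + \Pr\left[\left\{F_1(Y_{1i})-F_0(Y_{0i})>\tau,\ F_0(Y_{0i})\le s\right\} \cap \left(\hat F_1(Y_{1i})-\hat F_0(Y_{0i})\le\tau\right)\right] \\
&\quad + \Pr\left[\left\{F_1(Y_{1i})-F_0(Y_{0i})>\tau,\ F_0(Y_{0i})\le s\right\} \cap \left(\hat F_0(Y_{0i})>s\right)\right] \\
&\le \Pr\left[\hat F_1(Y_{1i})-\hat F_0(Y_{0i})>\tau,\ F_1(Y_{1i})-F_0(Y_{0i})\le\tau\right] \\
&\quad + \Pr\left\{\hat F_1(Y_{1i})-\hat F_0(Y_{0i})\le\tau,\ F_1(Y_{1i})-F_0(Y_{0i})>\tau\right\} \\
&\quad + \Pr\left[\hat F_0(Y_{0i})\le s,\ F_0(Y_{0i})>s\right] + \Pr\left[F_0(Y_{0i})\le s,\ \hat F_0(Y_{0i})>s\right].
\end{aligned}
$$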


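Therefore,
$$
\begin{aligned}
&E\left(\sqrt{n\sigma_n^d}\,|T_2(x)|\right) \\
&\le \frac{1}{\sqrt{n\sigma_n^d}}\sum_{i=1}^n E\left(K\!\left(\frac{X_i-x}{\sigma_n}\right)\left|1\!\left(\hat F_1(Y_{1i})-\hat F_0(Y_{0i})>\tau,\ \hat F_0(Y_{0i})\le s\right) - 1\!\left(F_1(Y_{1i})-F_0(Y_{0i})>\tau,\ F_0(Y_{0i})\le s\right)\right|\right) \\
&= \frac{1}{\sqrt{n\sigma_n^d}}\sum_{i=1}^n E_{X_i,Y_{0i},Y_{1i}}\left\{K\!\left(\frac{X_i-x}{\sigma_n}\right) \times E\left(\left|1\!\left(\hat F_1(Y_{1i})-\hat F_0(Y_{0i})>\tau,\ \hat F_0(Y_{0i})\le s\right) - 1\!\left(F_1(Y_{1i})-F_0(Y_{0i})>\tau,\ F_0(Y_{0i})\le s\right)\right| \,\Big|\, X_i,Y_{0i},Y_{1i}\right)\right\} \\
&\le \frac{1}{\sqrt{n\sigma_n^d}}\sum_{i=1}^n E_{X_i,Y_{0i},Y_{1i}}\left\{K\!\left(\frac{X_i-x}{\sigma_n}\right)\Pr\left(\hat F_1(Y_{1i})-\hat F_0(Y_{0i})>\tau,\ F_1(Y_{1i})-F_0(Y_{0i})\le\tau \mid X_i,Y_{0i},Y_{1i}\right)\right\} \\
&\quad + \frac{1}{\sqrt{n\sigma_n^d}}\sum_{i=1}^n E_{X_i,Y_{0i},Y_{1i}}\left\{K\!\left(\frac{X_i-x}{\sigma_n}\right)\Pr\left(\hat F_1(Y_{1i})-\hat F_0(Y_{0i})\le\tau,\ F_1(Y_{1i})-F_0(Y_{0i})>\tau \mid X_i,Y_{0i},Y_{1i}\right)\right\} \\
&\quad + \frac{1}{\sqrt{n\sigma_n^d}}\sum_{i=1}^n E_{X_i,Y_{0i}}\left\{K\!\left(\frac{X_i-x}{\sigma_n}\right)\Pr\left\{\hat F_0(Y_{0i})\le s,\ F_0(Y_{0i})>s \mid X_i,Y_{0i}\right\}\right\} \\
&\quad + \frac{1}{\sqrt{n\sigma_n^d}}\sum_{i=1}^n E_{X_i,Y_{0i}}\left\{K\!\left(\frac{X_i-x}{\sigma_n}\right)\Pr\left\{\hat F_0(Y_{0i})>s,\ F_0(Y_{0i})\le s \mid X_i,Y_{0i}\right\}\right\} \\
&:= S_1 + S_2 + S_3 + S_4, \text{ say.}
\end{aligned}
\tag{23}
$$
We will show that $S_1 \to 0$, and an exactly analogous proof will show that $S_2, S_3, S_4$ are also $o(1)$.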

Now, for fixed $X_i, Y_{0i}, Y_{1i}$, and using the fact that e.g. $\hat F_1(Y_{1i}) = \frac{1}{n-1}\sum_{j\neq i} 1(Y_{1j}\le Y_{1i})$, we have that
$$
\begin{aligned}
&\Pr\left(\hat F_1(Y_{1i})-\hat F_0(Y_{0i})>\tau,\ F_1(Y_{1i})-F_0(Y_{0i})<\tau \mid X_i,Y_{0i},Y_{1i}\right) \\
&= \Pr\left(\hat F_1(Y_{1i})-\hat F_0(Y_{0i}) - \left(F_1(Y_{1i})-F_0(Y_{0i})\right) > \tau - \left(F_1(Y_{1i})-F_0(Y_{0i})\right),\ F_1(Y_{1i})-F_0(Y_{0i})<\tau \mid X_i,Y_{0i},Y_{1i}\right) \\
&\le \exp\left(-2(n-1)\left(\tau - \left(F_1(Y_{1i})-F_0(Y_{0i})\right)\right)^2\right) \times 1\!\left(F_1(Y_{1i})-F_0(Y_{0i})<\tau\right),
\end{aligned}
$$
by Hoeffding’s inequality (note that conditional on $Y_{1i}$, $\hat F_1(Y_{1i}) = \frac{1}{n-1}\sum_{j\neq i} 1(Y_{1j}\le Y_{1i})$ is an average of independent, binary (0, 1) random variables, thus satisfying the hypothesis

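of Hoeffding’s inequality). Thus, we have that
$$
\begin{aligned}
S_1 &\le \frac{1}{\sqrt{n\sigma_n^d}}\sum_{i=1}^n E_{X_i,Y_{0i},Y_{1i}}\left[K\!\left(\frac{X_i-x}{\sigma_n}\right) \exp\left(-2(n-1)\left(\tau - \left(F_1(Y_{1i})-F_0(Y_{0i})\right)\right)^2\right) \times 1\!\left(F_1(Y_{1i})-F_0(Y_{0i})<\tau\right)\right] \\
&= \frac{n}{\sqrt{n\sigma_n^d}}\, E_{X,Y_0,Y_1}\left[K\!\left(\frac{X-x}{\sigma_n}\right) \exp\left(-2(n-1)\left(\tau - \left(F_1(Y_1)-F_0(Y_0)\right)\right)^2\right) \times 1\!\left(F_1(Y_1)-F_0(Y_0)<\tau\right)\right] \\
&= \frac{n}{\sqrt{n\sigma_n^d}}\, E_X\left[K\!\left(\frac{X-x}{\sigma_n}\right) g(X)\right],
\end{aligned}
$$
where
$$
g(u) = E_{Y_0,Y_1 \mid X}\left[\exp\left(-2(n-1)\left(\tau - \left(F_1(Y_1)-F_0(Y_0)\right)\right)^2\right) \times 1\!\left(F_1(Y_1)-F_0(Y_0)<\tau\right) \,\Big|\, X=u\right].
$$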

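Continuing with the previous display, we have
$$
\begin{aligned}
S_1 &\le \frac{n}{\sqrt{n\sigma_n^d}}\, E_X\left[K\!\left(\frac{X-x}{\sigma_n}\right) g(X)\right] \\
&= \frac{n\sigma_n^d}{\sqrt{n\sigma_n^d}} \int \left[K(u)\, g(x+u\sigma_n)\, f(x+u\sigma_n)\right] du \\
&= \sqrt{n\sigma_n^d} \int \left[K(u)\, g(x+u\sigma_n)\, f(x+u\sigma_n)\right] du \\
&= g(x)\sqrt{n\sigma_n^d}\, f(x) \int K(u)\, du + \text{terms of smaller order} \\
&= g(x)\sqrt{n\sigma_n^d}\, f(x) + \text{terms of smaller order}.
\end{aligned}
\tag{24}
$$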

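Now, notice that defining $a = \tau - \left(F_1(Y_1)-F_0(Y_0)\right)$, $g(x)$ is of the form
$$
\begin{aligned}
g(x) &= E_{a \mid X}\left[\exp\left(-2(n-1)a^2\right) \times 1(a>0) \mid X=x\right] \\
&\le \int_0^\infty \exp\left(-2(n-1)a^2\right) f(a \mid x)\, da \\
&\le M \int_0^\infty \exp\left(-2(n-1)a^2\right) da \\
&= O\left(n^{-1/2}\right)
\end{aligned}
\tag{25}
$$
by the normal (Gaussian) integral formula, where $M$ denotes a finite upper bound on the conditional density $f(a \mid x)$. From (24) and (25), it follows that
$$
E\left(\sqrt{n\sigma_n^d}\,|T_2(x)|\right) = O\left(n^{-1/2}\right) \times O\left(\sqrt{n\sigma_n^d}\right) = O\left(\sqrt{\sigma_n^d}\right) = o(1).
$$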
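The Gaussian integral behind the last step of (25) can be written out explicitly. The display below is a sketch under stated assumptions: $M$ is a label used here for a finite uniform bound on the conditional density of $a$ given $X$, and $g$, $f$, $\sigma_n$ and $d$ denote the conditional expectation bounded in (25), the density of $X$, the bandwidth and the dimension of $X$, respectively.

```latex
\int_0^\infty \exp\!\left(-2(n-1)a^2\right) da
  = \frac{1}{2}\sqrt{\frac{\pi}{2(n-1)}}
  = O\!\left(n^{-1/2}\right),
\qquad\text{so that}\qquad
\sqrt{n\sigma_n^d}\; g(x)\, f(x)
  \le M f(x)\,\frac{1}{2}\sqrt{\frac{\pi\, n\,\sigma_n^d}{2(n-1)}}
  = O\!\left(\sqrt{\sigma_n^d}\right) \to 0
  \quad\text{as } \sigma_n \to 0 .
```

The exponential tail bound therefore shrinks fast enough to offset the $\sqrt{n\sigma_n^d}$ normalisation, leaving a term that vanishes with the bandwidth.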

Together with analogous proofs for $S_2, S_3, S_4$, this implies that $\sqrt{n\sigma_n^d}\, T_2(x) = o_p(1)$, which is the desired result.

Part D:
Uniform confidence bands for conditional mobility (following Hardle (1990), algorithm 4.3.2):
First note that the inequality in step 3 of (25) is uniform in $x$ because $X$ has bounded support. Consequently the "terms of smaller order" in (24) are also small uniformly in $x$. Given that $f(\cdot)$ is also uniformly bounded by NW5’, the conclusion $\sqrt{n\sigma_n^d}\, T_2(x) = o_p(1)$ can be strengthened to $\sup_x \sqrt{n\sigma_n^d}\,|T_2(x)| = o_p(1)$. This shows that $\left(n\sigma_n^d\right)^{1/2}\left[\hat\upsilon(\tau,s;x) - \upsilon(\tau,s;x)\right]$, as an empirical process, is asymptotically equivalent to linear combinations of the Nadaraya-Watson regression residual processes
$$
\left(n\sigma_n^d\right)^{1/2}\left\{\hat E(W \mid X=x) - E(W \mid X=x)\right\}, \text{ and}
$$
$$
\left(n\sigma_n^d\right)^{1/2}\left\{\hat E(V \mid X=x) - E(V \mid X=x)\right\},
$$
which converge weakly to zero-mean Gaussian processes. Consequently, they are amenable to the treatment in Hardle (1990), based on Bickel and Rosenblatt (1973), which develops uniform confidence bands for NW regression curves. We now outline Hardle’s construction.
For each sample value $x$ of the conditioning variable $X$, bandwidth $\sigma_n$ and kernel $K(\cdot)$, denote the estimated density at $X=x$ by
$$
\hat f(x) = \frac{1}{n\sigma_n}\sum_{i=1}^n K\!\left(\frac{X_i-x}{\sigma_n}\right).
$$
Consider dependent variables $W_i = 1\!\left(\hat F_1(Y_{1i})-\hat F_0(Y_{0i})>\tau,\ \hat F_0(Y_{0i})\le s\right)$ for upward mobility and $W_i = 1\!\left(Y_{1i}\le\hat\zeta_1,\ Y_{0i}\le\hat\zeta_0\right)$ for transition probability. Denote the regression estimate (predicted value) at $X=x$ by
$$
\hat m(x) = \frac{\dfrac{1}{n\sigma_n}\sum_{i=1}^n W_i\, K\!\left(\dfrac{X_i-x}{\sigma_n}\right)}{\hat f(x)}.
$$

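Corresponding to the Epanechnikov kernel, set
$$
c_K = \int_{-1}^{1} K^2(u)\, du = \int_{-1}^{1} \frac{9}{16}\left(1-u^2\right)^2 du = 3/5 = 0.6,
\qquad
c_2 = \frac{\int_{-1}^{1}\left\{K'(u)\right\}^2 du}{2 c_K} = 1.25.
$$
• $\delta_n = \sqrt{2\ln\left(\dfrac{1}{\sigma_n}\right)}$
• $d_n = \sqrt{2\ln\left(\dfrac{1}{\sigma_n}\right)} + \dfrac{1}{\sqrt{2\ln\left(\frac{1}{\sigma_n}\right)}}\,\ln\sqrt{\dfrac{c_2}{2\pi^2}} = \delta_n + \dfrac{1}{2\delta_n}\ln\left(\dfrac{c_2}{2\pi^2}\right)$
• $c_\alpha = -\ln\left(-0.5\times\ln(1-0.05)\right) = 3.66$
• $\hat\sigma^2(x) = \dfrac{1}{n\sigma_n\hat f(x)}\sum_{i=1}^n \left\{W_i - \hat m(X_i)\right\}^2 K\!\left(\dfrac{X_i-x}{\sigma_n}\right)$
• Then the lower and upper limits of the uniform CI are given by
$$
L(x) = \hat m(x) - \left\{\frac{c_\alpha}{\delta_n} + d_n\right\} \times \sqrt{\frac{c_K\,\hat\sigma^2(x)}{n\sigma_n \times \hat f(x)}},
\qquad
U(x) = \hat m(x) + \left\{\frac{c_\alpha}{\delta_n} + d_n\right\} \times \sqrt{\frac{c_K\,\hat\sigma^2(x)}{n\sigma_n \times \hat f(x)}}.
$$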


• Use additional assumption:
NW6: The density of $X$ is bounded away from zero on its support.
• Justification of the above algorithm comes from theorem 4.3.1 of Hardle (1990). Our assumptions NW will imply conditions A1-A5 of Hardle (1990). To see this, note that NW5’ implies A1 (note that for a binary outcome $W$, the variance $Var(W \mid x) = E(W \mid x)\times\left(1 - E(W \mid x)\right)$, so that differentiability of $E(W \mid x)$ implies differentiability of $Var(W \mid x)$); the Epanechnikov kernel satisfies condition A2; the outcome variables lie in $[0, 1]$ with probability one, implying A3; condition NW6 implies A4; and choice of bandwidth $\sigma_n$ equal to $n^{-1/4}$ implies A5.
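The band construction outlined above can be sketched numerically as follows. This is an illustrative sketch rather than the authors' implementation: it assumes a scalar conditioning variable, a bandwidth below one, the Epanechnikov kernel, and evaluation points that have sample observations within one bandwidth. The function names `kernel_constants`, `critical_value` and `uniform_band`, and the simulated data, are hypothetical.

```python
import numpy as np


def epanechnikov(u):
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)


def kernel_constants(m=200_000):
    """Midpoint-rule check of c_K = int K^2 and c_2 = int (K')^2 / (2 c_K)."""
    du = 2.0 / m
    u = -1.0 + du * (np.arange(m) + 0.5)
    c_k = np.sum(epanechnikov(u) ** 2) * du
    c_2 = np.sum((-1.5 * u) ** 2) * du / (2.0 * c_k)  # K'(u) = -1.5 u on [-1, 1]
    return c_k, c_2


def critical_value(alpha=0.05):
    """c_alpha = -ln(-0.5 * ln(1 - alpha))."""
    return -np.log(-0.5 * np.log(1.0 - alpha))


def uniform_band(w, x, grid, h, alpha=0.05):
    """Return (lower, fit, upper) on `grid` for binary outcome w, scalar x."""
    w, x, grid = (np.asarray(a, dtype=float) for a in (w, x, grid))
    n = x.size
    c_k, c_2 = 0.6, 1.25  # Epanechnikov constants stated in the text

    def nw(points):
        # Kernel weights, density estimate and NW fit at each point.
        k = epanechnikov((x[None, :] - points[:, None]) / h)
        f_hat = k.sum(axis=1) / (n * h)
        return k, f_hat, (k @ w) / (n * h * f_hat)

    _, _, m_at_data = nw(x)          # fitted values at the sample points
    k, f_hat, m_hat = nw(grid)
    sigma2 = (k @ (w - m_at_data) ** 2) / (n * h * f_hat)
    delta = np.sqrt(2.0 * np.log(1.0 / h))   # requires h < 1
    d_n = delta + np.log(c_2 / (2.0 * np.pi**2)) / (2.0 * delta)
    half = np.sqrt(c_k * sigma2 / (n * h * f_hat)) * (critical_value(alpha) / delta + d_n)
    return m_hat - half, m_hat, m_hat + half


# Usage on purely synthetic data (all values below are made up).
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0.0, 1.0, n)
w = (rng.uniform(size=n) < 0.3 + 0.4 * x).astype(float)
grid = np.linspace(0.2, 0.8, 7)
lower, fit, upper = uniform_band(w, x, grid, h=n ** (-0.25))
```

The sketch builds an n-by-n weight matrix to obtain fitted values at the sample points, which is convenient for moderate sample sizes but would need chunking for very large ones.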

