Full text of Working Papers (Federal Reserve Bank of Chicago) : Employer Learning and Statistical Discrimination, Working Paper 1997-11

Employer Learning and Statistical Discrimination
Joseph G. Altonji, Charles R. Pierret

Working Papers Series
Macroeconomic Issues
Research Department
Federal Reserve Bank of Chicago
December 1997 (WP-97-11)


First Draft: July 1995
This version: October 1997

Employer Learning and Statistical Discrimination

Joseph G. Altonji
Northwestern University and Federal Reserve Bank of Chicago

Charles R. Pierret
Bureau of Labor Statistics

This research was supported by the Institute for Policy Research, Northwestern University, the Bureau of
Labor Statistics, U.S. Department of Labor, and the National Science Foundation. We owe a special debt to
Nachum Sicherman for assisting us with the NLSY data. We thank Paul Devereux, Judith Hellerstein,
Derek Neal, Bruce Weinberg, and participants in seminars at BLS, Berkeley, Boston University, Columbia,
the Federal Reserve Bank of Chicago, Indiana University, University of Maryland, McMaster, NBER,
Northwestern, the University College London, the Upjohn Institute and the U. of Western Ontario for
helpful comments. We are responsible for all errors and omissions. The opinions stated in this paper do not
necessarily represent the official position or policy of the U.S. Department of Labor, the Federal Reserve
Bank of Chicago, or the Federal Reserve System.




Employer Learning and Statistical Discrimination
Abstract

We provide a test for statistical discrimination or “rational” stereotyping in environments in which agents
learn over time. Our application is to the labor market. If profit maximizing firms have limited information
about the general productivity of new workers, they may choose to use easily observable characteristics
such as years of education to "statistically discriminate" among workers. As firms acquire more
information about a worker, pay will become more dependent on actual productivity and less dependent on
easily observable characteristics or credentials that predict productivity. Consider a wage equation that
contains both the interaction between experience and a hard to observe variable that is positively related to
productivity and the interaction between experience and a variable that firms can easily observe, such as
years of education. We show that the wage coefficient on the unobservable productivity variable should
rise with time in the labor market and the wage coefficient on education should fall. We investigate this
proposition using panel data on education, the AFQT test, father's education, and wages for young men and
their siblings from NLSY. We also examine the empirical implications of statistical discrimination on the
basis of race. Our results support the hypothesis of statistical discrimination, although they are inconsistent
with the hypothesis that firms fully utilize the information in race. Our analysis has wide implications for the
analysis of the determinants of wage growth and productivity and the analysis of statistical discrimination in
the labor market and elsewhere.
JEL Classification: D83, J31
Joseph G. Altonji
Department of Economics
Northwestern University
Evanston, IL 60208
(847) 492-8218
altonji@nwu.edu




Charles Pierret
Bureau of Labor Statistics
2 Massachusetts Ave. NE Suite 4945
Washington, D.C. 20212
(202) 606-7519
pierret_c@bls.gov

1. Introduction
People go through life making an endless stream of judgments on the basis of limited information
about matters as diverse as the safety of a street, the quality of a car, the suitability of a potential spouse,
and the skill and integrity of a politician. When hiring, employers must assess the value of potential
workers with only the information contained in resumes, recommendations, and personal interviews.
What do employers know about the productivity of young workers, and how quickly do they learn?
Given lack of information about actual productivity, do employers "statistically discriminate" among
young workers on the basis of easily observable variables such as education, race, and other clues to a
worker's labor force preparation? Many issues in labor economics hinge on the answers, including the
empirical relevance of the signaling model of education (Weiss (1995)), statistical theories of
discrimination (Aigner and Cain (1977), Lundberg and Startz (1983)), and the interpretation of earnings
dynamics. The desirability of changes in the laws governing hiring procedures, evaluation of employees,
layoff and firing costs, and the provision of references for former employees also hinges on the answers.
Although labor economists typically assume wages are strongly influenced by employer beliefs about
worker productivity, there is little empirical research on how much employers know about their workers,
or about how this information changes with time in the labor market.1
In this paper we explore the implications of a hypothesis that we refer to as Statistical
Discrimination with Employer Learning, or SD-EL. Our working hypothesis is that firms have only
limited information about the quality of workers in the early stages of their careers. They distinguish
among workers on the basis of easily observable variables that are correlated with productivity such as
years of education or degree, the quality of the school the person attended, race, and gender. (To avoid
misunderstanding we wish to stress that part of the relationship between wages and race and gender
may reflect biased inferences on the part of employers or other forms of discrimination that have nothing
to do with productivity or information.) Firms weigh this information with other information about
outside activities, work experience to date, references, the job interview, and perhaps formal testing by
the firm. Each period, the firm observes noisy indicators of the worker’s performance. Over time, these
make the information observed at the start redundant. Wages become more closely tied to actual
productivity and less strongly dependent upon the information that was readily available at the beginning
of a worker's career. The main contribution of the paper is to provide a way to test for whether firms
statistically discriminate on the basis of readily available information such as education and race. We
also provide a way to estimate the learning profile of firms and address the issue of whether firms have a
stable view of the productivity of workers with many years of labor market experience.

1 There is a large empirical and theoretical literature on labor market search and on the effects of learning about the
quality of the job match on wages and mobility. See Devine and Kiefer (1991) for a comprehensive survey.

Our research builds on some previous work, particularly Farber and Gibbons (1996).2 Farber
and Gibbons investigate three implications of employer learning. Imagine a variable s (say schooling)
which firms can observe directly and a second variable, z (say AFQT test scores or a sibling's wage rate)
which firms cannot observe directly. They show first that employer learning does not imply that the
coefficient on s in a wage regression will change with experience. This is because future observations,
on average, simply validate the relationship between expected productivity and s for new entrants. Their
empirical evidence is generally supportive of this result, although they note that a positive interaction
could arise if schooling is complementary with training. (Positive interactions are found in a number of
data sets, including the PSID.) Second, they show that the part of z that is orthogonal to information
available to employers at the beginning of a worker's career will have an increasingly large association
with wages as time passes. Third, they note that wage growth will be a martingale process, at least in
the case in which productivity of the worker is constant.
In this paper we focus on a different but related proposition that allows us to examine the issue
of statistical discrimination. The proposition concerns how controlling for the experience profile of the
effect of z on wages alters the interaction between experience and s. We show that not only should the
coefficient on z rise with time in the labor market, but the coefficient on s should fall. We investigate
these propositions using data on young men from the NLSY. We also explore the implications of
statistical discrimination on the basis of race, which is also easily observable to employers and is
correlated with hard to observe background variables that influence productivity.3 While our basic
theoretical framework and most of the empirical analysis assumes that all employers have the same
information about workers, we provide a preliminary discussion of the implications of models in which
the current employer has an information advantage.

2 Other relevant references are Gibbons and Katz (1991), which we discuss below, and Parsons (1993). Glaezer (1992)
uses variances in wage innovations as a measure of learning. His work is somewhat closely related to Farber and
Gibbons. However, he attempts to distinguish between information that is specific to the job match and information about
general productivity. Foster and Rosenzweig (1993) use data on piece rate and time-rate workers to investigate several
implications of imperfect information on the part of employers that are different from the one studied here. Their results
imply that the incompleteness of employer information is an important issue. Studies following performance evaluations
within firms based on the EOPP data, or studies using firm personnel files (Medoff and Abraham (1980)) are also
relevant, but have a very different focus than the present paper. Parsons (1986), Weiss (1995) and Carmichael (1989)
provide useful discussions of some of the theoretical issues on the link between wages and employer perceptions about
productivity. Albrecht (1982) conducts a test of screening models of education based on the idea that education will have
less impact on the probability a worker will be hired if the worker was referred to the firm by another worker because
some of the information contained in education will be transmitted through the referral. Montgomery (1991) presents a
model in which employers obtain valuable information on the productivity of new employees through referrals and is part
of a large literature on labor market networks. For empirical evidence see Holzer (1988).

In Section 2 we present our basic theoretical framework in a setting in which information is
public, and then informally discuss the case in which it is private. We also consider the effect that
associations between s, z, and job training would have on the analysis. In Section 3 we discuss the
NLSY data used in the study. In Sections 4 and 5 we present our results for education and race. In
Section 6 we present results in which we control for job training. In Section 7 we discuss the case in
which employer information is private and provide some evidence on how hard to observe variables are
related to the probability of a layoff and the wage losses associated with layoffs. In Section 8 we point
out that interpreting our estimates of the time profile of the effect of AFQT on wages as the result of
employer learning implies that high ability workers would have a substantial financial incentive to take
the AFQT to differentiate themselves from those who are less able in this dimension. The fact that we
do not generally observe this raises additional research questions. In Section 9 we close the paper with a
discussion of some of the implications of our analysis for a number of standard topics in labor economics
and a research agenda.

2. Implications of Statistical Discrimination and Employer Learning for Wages.
2.1 A Model of Employer Learning and Wages
In this section we show how the wage coefficients on characteristics that employers can observe
directly and on characteristics they cannot observe directly will change with experience if employers
statistically discriminate and become better informed about workers over time.

3 We are using the term "statistical discrimination" as synonymous with the use of the term “rational expectations” in the
economics literature. We mean that in the absence of full information, firms distinguish between individuals with
different characteristics based on statistical regularities. In other words, we mean that firms form stereotypes that are
rational in the sense that they are consistent with reality. Many papers that use the term statistical discrimination analyze
race or gender differentials that arise because firms have trouble processing the information they receive about the
performance of minority group members. This difficulty may lead to negative outcomes for minorities because it lowers
their incentives to make unobservable investments that raise productivity. It also may lead to negative outcomes if the
productivity of a job match depends on the fit between the worker and the job. Some papers also consider whether firms
that start with incorrect beliefs about the relationship between personal characteristics and productivity (inaccurate
stereotypes) would correct them, and, in models with worker investment, whether the priors held by firms may be self
fulfilling. See Aigner and Cain (1977), Lundberg and Startz (1983), Lang (1986), Coate and Loury (1993) and
Oettinger (1996). In Oettinger’s model productivity is match specific and productivity signals are noisier for blacks than
whites. As a result the sorting process across job changes is less efficient for blacks, and a race gap develops over time.




Our model is very similar to Farber and Gibbons (1996). Let $y_{it}$ be the log of labor market
productivity of worker i with $t_i$ years of experience. $y_{it}$ is determined by

(1)    $y_{it} = r s_i + H(t_i) + \alpha_1 q_i + \Lambda z_i + \eta_i$

where $s_i$ is years of schooling, $z_i$ is a vector of correlates of productivity that are not observed directly by
employers but are available to the econometrician, and $H(t_i)$ is the experience profile of productivity.
The variable $\eta_i$ consists of other determinants of productivity and is not directly observed by the
employers or the econometrician. The elements of $z_i$ might be a test score, the income of an older
sibling, father's education, or indicators of childhood environment such as books in the home or
ownership of a library card. We normalize $z_i$ so that all the elements of the conformable coefficient
vector $\Lambda$ are positive. Without loss of generality we scale $\eta_i$ so that it has a unit coefficient in the
productivity equation.
In addition to $s_i$, the employer observes a vector $q_i$ of other information about the worker that is
relevant to productivity. The elements of $q_i$ are related to productivity by the coefficient vector $\alpha_1$. For
now we assume that the experience profile of productivity does not depend on $s_i$, $z_i$, $q_i$, or $\eta_i$. In section
2.2 we discuss the sensitivity of our analysis to this assumption. In most of the analysis we suppress the
i subscript. All variables are expressed as deviations from population means. Although we use years of
schooling and race as our examples of s, our analysis applies to any variable that employers can easily
observe.
We assume that the conditional expectations $E(z|s,q)$ and $E(\eta|s,q)$ are linear in q and s, so

(2)    $z = E(z|s,q) + v = \gamma_1 q + \gamma_2 s + v$

(3)    $\eta = E(\eta|s,q) + e = \alpha_2 s + e$,

where the vector v and the scalar e have mean 0 and are uncorrelated with q and s by definition of an
expectation.4 The links from s to z and $\eta$ may be partially due to a causal effect of s.5 Equations (1),
(2) and (3) imply that $\Lambda v + e$ is the error in the employer's belief about the log of productivity of the
worker at the time the worker enters the labor market. The sum $\Lambda v + e$ is uncorrelated with s and q. We
make the additional assumption that $\Lambda v + e$ is independent of q and s.
4 The exclusion of q from the conditional mean of $\eta$ is innocuous, since we are simply defining q and the coefficient
vector $\alpha_1$ on q in (1) so that the mean of $\eta$ does not depend on q.
5 For example, below we use the Armed Forces Qualification Test (AFQT) as z and years of education as s, and Neal and
Johnson (1996) present evidence that years of education have a sizable positive effect on AFQT.



Each period that a worker is in the labor market, firms observe a noisy signal of the productivity
of the worker,

(4)    $\xi_t = y + \varepsilon_t$

where y is $y_t - H(t)$ and $\varepsilon_t$ reflects transitory variation in the performance of worker i and the effects of
variation in the firm environment that are hard for the firm to control for in evaluating the worker. (We
continue to suppress the i subscripts.) The term $\varepsilon_t$ is assumed to be independent of the other variables in
the model. We are also implicitly assuming that the component of $\varepsilon_t$ that reflects temporal variation in
productivity from sources specific to worker i is serially uncorrelated, because otherwise firms would
have an incentive to base compensation in t+1 on what they know about the worker specific component
of $\varepsilon_t$.6 However, $\varepsilon_t$ may be serially correlated as a result of the other factors.
Since the employers know q and s, observing $\xi_t$ is equivalent to observing

(5)    $d_t = \Lambda v + e + \varepsilon_t = \xi_t - E(y|s,q)$

The vector $D_t = \{d_1, d_2, \ldots, d_t\}$ summarizes the worker's performance history. Let $\mu_t$ be the
difference between $\Lambda v + e$ and $E(\Lambda v + e|D_t)$. By definition $\mu_t$ is uncorrelated with $D_t$, q and s but in
addition we assume $\mu_t$ is distributed independently of $D_t$, q and s.
We also assume for now that q, s, and the worker's performance history (summarized by the
vector $D_t = \{d_1, d_2, \ldots, d_t\}$) are known to all employers, as in Farber and Gibbons. (We discuss the private
information case in Section 7.) As a result of competition among firms, the worker receives a wage $W_t$
equal to the expected value of productivity $Y_t$ ($Y_t = \exp(y_t)$) times the multiplicative error component
$\exp(\varsigma_t)$ that reflects measurement error and firm specific factors that are outside the model and are
unrelated to s, z, and q. The wage model is

(6)    $W_t = E(Y_t|s,q,D_t)\, e^{\varsigma_t}$

Using (1), (2), (3) and (6) leads to

(7)    $W_t = E(Y_t|s,q,D_t)\, e^{\varsigma_t} = e^{rs + H(t)}\, e^{(\alpha_1 + \Lambda\gamma_1)q + (\alpha_2 + \Lambda\gamma_2)s}\, e^{E(\Lambda v + e|D_t)}\, E(e^{\mu_t})\, e^{\varsigma_t}$

Taking logs and collecting terms leads to

(8)    $w_t = (r + \Lambda\gamma_2 + \alpha_2)s + H^*(t) + (\Lambda\gamma_1 + \alpha_1)q + E(\Lambda v + e|D_t) + \varsigma_t$


6 The firm’s knowledge of a serially correlated productivity component would imply serially correlated transitory variation
in the wage error of the type found by Farber and Gibbons (1996), but would not have much effect on our analysis.



where $w_t = \log(W_t)$ and $H^*(t) = H(t) + \log(E(e^{\mu_t}))$. We will suppress the $\varsigma_t$ term in the equations that
follow.
In the context of the debate over signaling models of education Riley (1979) and others have
noted that unless the relationship between schooling and actual productivity changes, the coefficient on s
will not change. This is true regardless of why s is related to productivity. Farber and Gibbons also
make this point by showing in a similar model that the expected value of the coefficient of an OLS
regression of $w_t$ on s does not depend on t. They estimate an equation of the form

(8a)    $w_t = b_{st} s + H(t) + \alpha_1 q + E(\Lambda v + e|D_t)$

with q treated as an error component. They find that $b_{st}$ does not depend much on t.
Farber and Gibbons also make a second point, which is that if one adds the component z' of ($\Lambda v + e$)
that is uncorrelated with the employer's initial information s and q to the wage equation and
estimates

(8b)    $w_t = b_{st} s + b_{z't} z' + H(t) + \alpha_1 q + E(\Lambda v + e|D_t)$,

the coefficients on s do not depend on t. This follows almost immediately from the first result, because
adding a second variable to a regression model has no effect on the expected value of the first if the two
are uncorrelated. They provide evidence from NLSY that $b_{st}$ is relatively constant and $b_{z't}$ is increasing in
t.
In this paper, we establish a related set of results that permit one to examine the issue of
statistical discrimination. We begin with the case in which z and s are scalars and then consider the more
general cases. Among those who are working the means of q, s, and z may depend on t although this
will influence estimates of $H^*(t)$. However, we assume throughout that among those who are working
the covariances among q, s, and z do not depend on t. Under these assumptions the variances and
covariances involving q, s, and z and the regression coefficients $\Phi_{qs}$ and $\Phi_{qz}$ defined in (10) below do not
vary with t.7

7 Estimates of the experience profile $H^*(t)$ will be affected if the means of s, q, and z depend on t but this has no bearing
on our analysis.

Case 1: z is a scalar.




The analysis is cleanest when z and s are scalars. Least squares regression will identify the
parameters of the expectation of $w_t$ on s, z, and experience profile $H^*(t)$.8 Let $b_{st}$ and $b_{zt}$ be the
coefficients on s and z in the conditional expectation function when t=0...T, with

(9)    $E(w_t|s,z,t) = b_{st} s + b_{zt} z + H^*(t)$.

When the individual starts work (t is 0) this equation is

(9a)    $E(w_0|s,z,0) = b_{s0} s + b_{z0} z + H^*(0)$

To simplify the algebra but without any additional assumptions we re-interpret s, z, and q as components
of s, z, and q that are orthogonal to $H^*$. Then the wage process (7), the fact that $E(\Lambda v + e|D_0)$ is 0 (since
there is no work history when t=0), and some straightforward algebra involving the least squares
regression omitted bias formula imply that
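
(10)    $\begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix} = \begin{bmatrix} r + \Lambda\gamma_2 + \alpha_2 \\ 0 \end{bmatrix} + \begin{bmatrix} \Phi_{qs} \\ \Phi_{qz} \end{bmatrix}$

where $\Phi_{qs}$ and $\Phi_{qz}$ are the coefficients of the auxiliary regression of $(\alpha_1 + \Lambda\gamma_1)q$ on s and z. The
parameters $\{b_{st}, b_{zt}\}'$ are the sum of $\{b_{s0}, b_{z0}\}'$ and the coefficients of the regression of $E(\Lambda v + e|D_t)$
on s and z. That is,

$\begin{bmatrix} b_{st} \\ b_{zt} \end{bmatrix} = \begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix} + \frac{1}{|\mathrm{var}(s,z)|} \begin{bmatrix} \mathrm{var}(z) & -\mathrm{cov}(s,z) \\ -\mathrm{cov}(s,z) & \mathrm{var}(s) \end{bmatrix} \begin{bmatrix} 0 \\ \mathrm{cov}(v, E(\Lambda v + e|D_t)) \end{bmatrix}$

where $|\mathrm{var}(s,z)|$ is the determinant of $\mathrm{var}(s,z)$ and we use the facts that $\mathrm{cov}(s, E(\Lambda v + e|D_t)) = 0$ and
$\mathrm{cov}(z, E(\Lambda v + e|D_t)) = \mathrm{cov}(v, E(\Lambda v + e|D_t))$. This may be rewritten as

$\begin{bmatrix} b_{st} \\ b_{zt} \end{bmatrix} = \begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix} + \frac{\theta_t}{|\mathrm{var}(s,z)|} \begin{bmatrix} \mathrm{var}(z) & -\mathrm{cov}(s,z) \\ -\mathrm{cov}(s,z) & \mathrm{var}(s) \end{bmatrix} \begin{bmatrix} 0 \\ \Lambda\,\mathrm{var}(v) + \mathrm{cov}(v,e) \end{bmatrix}$

or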
(12a)    $b_{st} = b_{s0} + \theta_t \Phi_s$
(12b)    $b_{zt} = b_{z0} + \theta_t \Phi_z$

where $\Phi_s$ and $\Phi_z$ are the coefficients of the regression of $\Lambda v + e$ on s and z and

$\theta_t = \mathrm{cov}(E(\Lambda v + e|D_t), z)/\mathrm{cov}(\Lambda v + e, z) = \mathrm{cov}(E(\Lambda v + e|D_t), v)/\mathrm{cov}(\Lambda v + e, v)$

is a parameter that is specific to the experience level t. Note that $\theta_t\Phi_s$ and $\theta_t\Phi_z$ are the coefficients of
the regression of $E(\Lambda v + e|D_t)$ on s and z and that $\theta_t$ summarizes how much the firm knows about $\Lambda v + e$
at time t. It is easy to show (see Appendix 1) that $\Phi_s = -\Phi_{zs}\Phi_z$, where $\Phi_{zs}$ is the coefficient of the
regression of z on s. (This is the basis of proposition 3 below.)

8 Technically, it identifies the coefficients of the least squares linear projection of $w_t$ on s, z, and $H^*(t)$ if $E(\Lambda v + e|D_t)$ is not
linear in the functions of s, z, and t we introduce in our regression models. We ignore this distinction.
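
The scalar-case formulas in (12a) and (12b) can be sketched numerically. The Python fragment below is only an illustration: the function names and all moment values are hypothetical, and it uses the model's implication that cov(s, Λv+e) = 0, so that the regression of Λv+e on s and z depends only on var(s), var(z), cov(s,z) and cov(v, Λv+e).

```python
def learning_loadings(var_s, var_z, cov_sz, cov_v_x):
    """Coefficients (Phi_s, Phi_z) of the regression of Lambda*v + e on s and z.

    cov_v_x stands for cov(v, Lambda*v + e), which equals cov(z, Lambda*v + e)
    in the model; cov(s, Lambda*v + e) is zero by construction.
    """
    det = var_s * var_z - cov_sz ** 2          # determinant of var(s, z)
    phi_s = -cov_sz * cov_v_x / det
    phi_z = var_s * cov_v_x / det
    return phi_s, phi_z


def wage_coefficients(b_s0, b_z0, theta_t, phi_s, phi_z):
    """Equations (12a) and (12b): coefficients on s and z at experience t."""
    return b_s0 + theta_t * phi_s, b_z0 + theta_t * phi_z


# Hypothetical moments, chosen only for illustration.
phi_s, phi_z = learning_loadings(var_s=4.0, var_z=1.0, cov_sz=1.0, cov_v_x=0.3)
b_st, b_zt = wage_coefficients(b_s0=0.08, b_z0=0.02, theta_t=0.5,
                               phi_s=phi_s, phi_z=phi_z)
phi_zs = 1.0 / 4.0                              # cov(s, z) / var(s)
```

With these made-up numbers Phi_s is negative, Phi_z is positive, and Phi_s equals -Phi_zs times Phi_z, which is the relationship the text attributes to Appendix 1.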
To determine the behavior of $\theta_t\Phi_s$ and $\theta_t\Phi_z$ over time, note first that $\Phi_s < 0$ and $\Phi_z > 0$ if $\mathrm{cov}(v, \Lambda v + e) > 0$
and $\mathrm{cov}(s,z) > 0$. The latter condition is true when s is schooling and the scalar z is AFQT,
father's education, or the wage rate of an older sibling. The condition $\mathrm{cov}(v, \Lambda v + e) > 0$ simply states
that the unobserved (by the firm) productivity subcomponent v and composite unobserved productivity
term $\Lambda v + e$ have a positive covariance. This seems plausible to us for the z variables we consider.
The change over time in $b_{st}$ and $b_{zt}$ is determined by $\theta_t$. Intuitively, $\theta_t$ is bounded between 0 and
1. It is 0 in period 0, because in this period employers know nothing about $\Lambda v + e$, so $E(\Lambda v + e|D_0) = 0$.
The coefficient is 1 if $E(\Lambda v + e|D_t)$ is $\Lambda v + e$, since in this case the employer has learned what $\Lambda v + e$ is and
thus knows productivity y. It is also intuitive that $\theta_t$ is nondecreasing in t because the additional
information that arrives as the worker’s career progresses permits a tighter estimate of $\Lambda v + e$.9
The regularity conditions on the $\varepsilon_t$ process that are required for the time average of $\varepsilon_t$ to
converge almost surely to 0 as t becomes large constitute sufficient conditions for $\theta_t$ to converge to 1 as
t becomes large. (See Theorem 3.47 in White (1984) for a very general set of conditions.) These
conditions limit the degree of dependence among the $\varepsilon_t$ and also restrict the variances. The intuition
for this is that future values of $\varepsilon_t$ must be sufficiently independent of the earlier $\varepsilon$'s to average out, and
must not be so variable that the future $d_t$ values have no new information about $\Lambda v + e$.10
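
The averaging argument can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that the signal noise follows a stationary AR(1) process with autocorrelation `rho`, and that the firm uses only the linear projection of Λv+e on the time average of its signals. Neither assumption is made in the paper, and the function names and parameter values are hypothetical. Under these assumptions the share of Λv+e that has been learned is var(Λv+e) / (var(Λv+e) + var(average noise)).

```python
def var_of_time_average_ar1(t, rho, var_eps):
    """Variance of the average of t consecutive draws from a stationary AR(1)
    process with autocorrelation rho and marginal variance var_eps."""
    total = 0.0
    for i in range(t):
        for j in range(t):
            total += rho ** abs(i - j)   # corr(eps_i, eps_j) for an AR(1)
    return var_eps * total / t ** 2


def theta_from_time_average(t, var_x, rho, var_eps):
    """cov(P, v)/cov(x, v), where x = Lambda*v + e and P is the linear
    projection of x on the time average of d_1, ..., d_t."""
    if t == 0:
        return 0.0                        # no performance history yet
    return var_x / (var_x + var_of_time_average_ar1(t, rho, var_eps))


# Illustrative values: serial correlation slows learning but does not stop it.
slow = [theta_from_time_average(t, var_x=1.0, rho=0.5, var_eps=4.0) for t in (1, 10, 200)]
fast = [theta_from_time_average(t, var_x=1.0, rho=0.0, var_eps=4.0) for t in (1, 10, 200)]
```

Positive serial correlation makes the time average noisier at every t, so learning is slower than in the independent case, but the learned share still rises toward 1 as t grows.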
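A simple example may be helpful. If $\varepsilon_t$ is iid with variance $\sigma_\varepsilon^2$, then $\theta_t$ has the familiar form

(13)    $\theta_t = \frac{\mathrm{var}(\Lambda v + e)}{\mathrm{var}(\Lambda v + e) + \sigma_\varepsilon^2/t}$ for $t > 0$, $\theta_0 = 0$

In this case, $\theta_t$ is strictly increasing in t because the independence among the $\varepsilon_t$ means that each new signal $d_t$ has
some new information about $\Lambda v + e$. $\theta_t$ is 0 when t is 0 and converges to 1 as t goes to infinity.

9 To establish this note that since $D_{t-1}$ is a subset of the information in $D_t$,
$[\mathrm{Cov}(v, E(\Lambda v + e|D_t) - E(\Lambda v + e|D_{t-1}))]/\mathrm{Cov}(v, \Lambda v + e) = \theta_t - \theta_{t-1} \geq 0$.
10 To establish the result note that in each period, firms observe $d_t = \Lambda v + e + \varepsilon_t$. In general, the form of $E(\Lambda v + e|D_t)$ will
depend on the pattern of serial correlation and the relative variances of $\varepsilon_1, \ldots, \varepsilon_t$. However, the firm can always choose to use
$E(\Lambda v + e|\bar{D}_t)$, where $\bar{D}_t$ is the time average $\sum_{k=1}^{t} (\Lambda v + e + \varepsilon_k)/t$, as an unbiased but perhaps inefficient estimator given $D_t$.
If as t goes to infinity $\bar{D}_t$ converges almost surely to $\Lambda v + e$, then $\mathrm{Cov}(E(\Lambda v + e|\bar{D}_t), v)/\mathrm{Cov}(\Lambda v + e, v)$ converges to 1 as t goes
to infinity. Since $E(\Lambda v + e|D_t)$ is more efficient than $E(\Lambda v + e|\bar{D}_t)$, $E(\Lambda v + e|D_t)$ must also converge almost surely to $\Lambda v + e$,
which establishes that $\mathrm{Cov}(E(\Lambda v + e|D_t), v)/\mathrm{Cov}(\Lambda v + e, v)$ converges to 1. We conclude that $\theta_t$ converges almost surely to
1 as t becomes large.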
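
As a minimal numerical sketch of equation (13), the function below evaluates the learning parameter for assumed values of var(Λv+e) and of the noise variance. The name `theta_iid` and the parameter values are illustrative assumptions, not estimates from the paper.

```python
def theta_iid(t, var_x, var_eps):
    """Equation (13): share of the initial error Lambda*v + e learned after t iid signals.

    var_x   : var(Lambda*v + e), variance of the employer's initial error (assumed value)
    var_eps : variance of the iid signal noise epsilon_t (assumed value)
    """
    if t == 0:
        return 0.0  # no performance history yet, so nothing has been learned
    return var_x / (var_x + var_eps / t)


# Illustrative parameter values only: noise variance four times var(Lambda*v + e).
path = [theta_iid(t, var_x=1.0, var_eps=4.0) for t in range(0, 41)]
```

With a noise variance four times as large as var(Λv+e), half of the initial error has been learned after four periods, and the path continues to rise toward 1.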
There are two conclusions, which we summarize in Propositions 1 and 2:
Proposition 1: Under the assumptions of the above model, the regression coefficient $b_{zt}$ is
nondecreasing in t. The regression coefficient $b_{st}$ is nonincreasing in t.11
Proposition 2: If firms have complete information about the productivity of new workers, then
$\partial b_{st}/\partial t = \partial b_{zt}/\partial t = 0$.

These results underlie our empirical analysis below. Using AFQT and father's education as z
variables, we examine the experience profile of $b_{st}$ and $b_{zt}$. The intuition for the decline in $b_{st}$ is that as
employers learn the productivity of workers, s will get less of the credit for an association with
productivity that arises because s is correlated with z provided that z is included in the wage equation
with a time dependent coefficient and can claim the credit.12 We also are able to estimate the time
profile of $\theta_t$ up to scale. Under the assumption that employers learn about v and e at the same rate, this
enables us to estimate the time profile of employer learning about productivity up to scale. In AP (1996)
we examine the implications of our estimates for pure signaling models of the return to education.
The model also implies a third result, which we state in Proposition 3.
Proposition 3: Under the assumptions of the above model, $\partial b_{st}/\partial t = -\Phi_{zs}\, \partial b_{zt}/\partial t$.
Since $\Phi_{zs}$ is simply the regression coefficient of z on s and can be estimated, the coefficient restriction in
Proposition 3 may provide leverage in differentiating between the learning/statistical discrimination
model and alternative explanations for the behavior of $b_{st}$ and $b_{zt}$.
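
Propositions 1 and 3 can be checked in a small simulation. The sketch below is a stylized version of the model, not the paper's empirical procedure. It assumes a scalar z, no q vector, normally distributed s, v, e and signal noise, and it omits the experience profile H(t) and the wage error. All parameter values and function names are hypothetical. Under normality, E(Λv+e | D_t) is θ_t times the average signal, which is what the code uses.

```python
import random


def ols_two_regressors(s, z, w):
    """OLS slopes of w on s and z (with an intercept), via the 2x2 normal equations."""
    n = len(w)
    ms, mz, mw = sum(s) / n, sum(z) / n, sum(w) / n
    sss = sum((a - ms) ** 2 for a in s)
    szz = sum((b - mz) ** 2 for b in z)
    ssz = sum((a - ms) * (b - mz) for a, b in zip(s, z))
    ssw = sum((a - ms) * (c - mw) for a, c in zip(s, w))
    szw = sum((b - mz) * (c - mw) for b, c in zip(z, w))
    det = sss * szz - ssz ** 2
    return (szz * ssw - ssz * szw) / det, (sss * szw - ssz * ssw) / det


def simulate(n=40000, t=10, r=0.05, lam=0.5, gamma2=0.8, alpha2=0.1,
             var_eps=5.0, seed=0):
    """Log wages at experience 0 and t for n workers; returns both sets of OLS slopes."""
    rng = random.Random(seed)
    s = [rng.gauss(0, 1) for _ in range(n)]
    v = [rng.gauss(0, 1) for _ in range(n)]
    e = [rng.gauss(0, 1) for _ in range(n)]
    z = [gamma2 * si + vi for si, vi in zip(s, v)]          # equation (2) without q
    x = [lam * vi + ei for vi, ei in zip(v, e)]             # employer's initial error
    var_x = lam ** 2 + 1.0
    theta = var_x / (var_x + var_eps / t)                   # equation (13)
    # Average of t iid signals d_k = x + eps_k, drawn directly.
    dbar = [xi + rng.gauss(0, (var_eps / t) ** 0.5) for xi in x]
    base = r + lam * gamma2 + alpha2                        # coefficient on s in (8)
    w0 = [base * si for si in s]                            # no history at t = 0
    wt = [base * si + theta * di for si, di in zip(s, dbar)]
    return ols_two_regressors(s, z, w0), ols_two_regressors(s, z, wt), gamma2


(b_s0, b_z0), (b_st, b_zt), phi_zs = simulate()
prop3_gap = (b_st - b_s0) + phi_zs * (b_zt - b_z0)   # should be close to zero
```

In this setup var(s) = 1, so the population coefficient of z on s is `gamma2`. The coefficient on z rises from roughly zero, the coefficient on s falls, and `prop3_gap` is close to zero up to sampling error.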
11 The coefficients on an unfavorable z characteristic, such as criminal involvement or alcohol use, will become more
negative to the extent that these reflect permanent traits. Assuming s is negatively correlated with the unfavorable z, $b_{st}$
will rise with t. As noted earlier, we have normalized z so that $\Lambda > 0$.
12 It might be particularly interesting to see if the "diploma effect" declines with t while the coefficients on hard to observe
productivity characteristics that correlate with getting a diploma rise. (See Frazis (1993) for a recent analysis of whether
there is a diploma effect.)

Additional Empirical Implications
As noted in footnote 3, the literature on statistical discrimination as well as the literature on labor
market networks has emphasized that the amount of information that is available
to firms (or the mapping between a given set of data and what the firm actually knows) may differ across
groups. Our model implies that these differences will lead to group differences in wage dynamics. To
see this, suppose that there are two groups, 1 and 2. For group 2 the firm's initial information set is
larger than for group 1. Consequently, $\mathrm{Var}(\Lambda v + e \mid \text{group 2}) < \mathrm{Var}(\Lambda v + e \mid \text{group 1})$ and
$\mathrm{cov}(\Lambda v + e, v \mid \text{group 2}) < \mathrm{cov}(\Lambda v + e, v \mid \text{group 1})$. From equation (10') or (11), it follows that $b_{st}$ and $b_{zt}$ vary
less over time for group 2 than group 1. In the extreme case, when firms are fully informed about group
2, $\mathrm{cov}(\Lambda v + e, v \mid \text{group 2})$ is 0 and $b_{st}$ and $b_{zt}$ are constant. In future work, it would be interesting to
use this implication as a way of testing the hypothesis that the quality of information that employers have
differs across labor force groups. Theories that stress differences in the ability of employers to evaluate
the performance of members of different groups imply different amounts of noise (from the point of
view of the employer) in the signals $d_t$ and different paths of $\theta_t$.
In standard labor data sets based on household interviews information on $y_i$ is not available.
However, it is interesting for at least two reasons to discuss the cross equation restrictions between the
equation relating $y_i$ to s and z that are implied by the model. First, data could be gathered from both
firms and workers. Second, information on $y_i$ or indicators of $y_i$ may be available for use in other
applications of our methods to study statistical discrimination. For example, in the study of mortgage
lending, panel data on households might provide data on both credit records (related to $y_i$), success in
loan applications (the counterpart to $w_{it}$), and hard to observe background variables (such as the income
and wealth of relatives). Suppose that one has a measure $y_{it}^*$ that is equal to $y_i$ plus noise $\zeta_{it}$. Assume
that $\zeta_{it}$ is independent of all other variables in the model. Then the model implies that

(13a)    $y_{it}^* = (b_{s0} + \Phi_s) s_i + (b_{z0} + \Phi_z) z_i + \text{error}$

where the error term is orthogonal to s and z. Note that the coefficients are time invariant. This
equation and (9) are heavily overidentified. By estimating the equations jointly one can identify $\theta_t$
separately from $\Phi_s$ and $\Phi_z$. The availability of a productivity indicator would be particularly useful when
one relaxes the assumption that the effect of s and z on y is time invariant.
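
The identification argument can be made concrete with a short sketch. Given a wage coefficient at t = 0, the same coefficient at experience t, and the corresponding coefficient from the productivity equation (13a), the learning parameter follows from (12a) or (12b). The function name and the numbers are hypothetical, and the sketch ignores sampling error and the joint estimation that the text recommends.

```python
def identify_theta(b_0, b_t, prod_coef):
    """Back out theta_t from one regressor's coefficients.

    b_0       : wage coefficient at t = 0, from (9a)
    b_t       : wage coefficient at experience t, from (9)
    prod_coef : coefficient in the productivity equation (13a), equal to b_0 + Phi
    """
    phi = prod_coef - b_0                 # Phi_s or Phi_z
    if phi == 0:
        raise ValueError("Phi is zero, so theta_t is not identified from this regressor")
    return (b_t - b_0) / phi


# Hypothetical coefficients consistent with theta_t = 0.5, Phi_s = -0.1, Phi_z = 0.4.
theta_from_z = identify_theta(b_0=0.02, b_t=0.22, prod_coef=0.42)
theta_from_s = identify_theta(b_0=0.08, b_t=0.03, prod_coef=-0.02)
```

Because the coefficients on s and on z each deliver an estimate of the same learning parameter, agreement between the two is one of the overidentifying restrictions mentioned above.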

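Case 2: z is a vector

We now consider the case in which z is a vector $z = \{z_1, z_2, \ldots, z_k, \ldots, z_K\}$. In this case,

(14)    $\begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix} = \begin{bmatrix} r + \Lambda\gamma_2 + \alpha_2 \\ 0 \end{bmatrix} + \begin{bmatrix} \Phi_{qs} \\ \Phi_{qz} \end{bmatrix}$

where $[\Phi_{qs}, \Phi_{qz}]'$ is the 1 x (K+1) vector of coefficients from the auxiliary regression of $(\alpha_1 + \Lambda\gamma_1)q$
on s and z.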
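In the vector case $[b_{st}, b_{zt}]$ are:

(15)    $\begin{bmatrix} b_{st} \\ b_{zt} \end{bmatrix} = \begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix} + \mathrm{var}(s,z)^{-1} \begin{bmatrix} 0 \\ \mathrm{cov}(z, E(\Lambda v + e|D_t)) \end{bmatrix}$

where $\mathrm{var}(s,z)$ is a K+1 x K+1 matrix, $\mathrm{cov}(z, E(\Lambda v + e|D_t))$ is a K element vector and we have used
the fact that $\mathrm{cov}(s, E(\Lambda v + e|D_t)) = 0$.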
Let $G_k$ be the k-th row of the K x K matrix

$G = [\mathrm{var}(s)\,\mathrm{var}(z) - \mathrm{cov}(z,s)\,\mathrm{cov}(z,s)']^{-1}$

where var(s) is the variance of the scalar s, var(z) is the variance of the vector z, and cov(z,s) is the
vector of covariances between s and the elements of z. In appendix 1 we show that $b_{st}$ and $b_{z_k t}$ can
be expressed as

(16)    $b_{st} = b_{s0} - \sum_{k=1}^{K} \mathrm{cov}(z_k, s)\, G_k\, \Theta_t\, [\mathrm{cov}(z, \Lambda v + e)]$

(17)    $b_{z_k t} = b_{z_k 0} + \mathrm{var}(s)\, G_k\, \Theta_t\, [\mathrm{cov}(z, \Lambda v + e)]$

where $\Theta_t$ is the K x K diagonal matrix whose k-th diagonal element is
$\Theta_t^k = \mathrm{cov}(E(\Lambda v + e|D_t), z_k)/\mathrm{cov}(\Lambda v + e, z_k)$. Let $\Phi_{z_k s}$ be the coefficient of the regression of $z_k$
on s, k=1,...,K. Equations (16) and (17) lead to Proposition 4, which is the vector analog to
Proposition 3.

Proposition 4: When z is a vector and the assumptions of the above model hold,

$\frac{\partial b_{st}}{\partial t} = -\sum_{k=1}^{K} \Phi_{z_k s}\, \frac{\partial b_{z_k t}}{\partial t}$.

Proposition 2 generalizes to the vector case. Proposition 1 does not. With multiple z
variables, one cannot in general sign $\partial b_{st}/\partial t$ and $\partial b_{z_k t}/\partial t$ even if all the elements of $\Lambda$ are positive,
each element of $\mathrm{cov}(z, \Lambda v + e)$ is positive, and $\Phi_{z_k s}$ is positive and $\Theta_t^k$ is increasing in t for all k.
However, from Proposition 4, it follows immediately that if $\partial b_{z_k t}/\partial t > 0$ and $\Phi_{z_k s} > 0$ for all $z_k$ used
in the analysis then $\partial b_{st}/\partial t$ is < 0. We can verify the conditions for a particular s and set of z
variables.13
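
The restriction in Proposition 4 can be checked directly from equation (15) for made-up population moments. The sketch below assumes K = 2 and uses hypothetical values for var(s,z), for cov(z, Λv+e), and for the learning shares at two experience levels. The small linear solver is included only to keep the example self-contained.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Hypothetical var(s, z) for s and two z variables, ordered (s, z_1, z_2).
var_sz = [[1.0, 0.5, 0.3],
          [0.5, 1.0, 0.6],
          [0.3, 0.6, 1.0]]
cov_z_x = [0.4, 0.2]   # assumed cov(z_k, Lambda*v + e)


def coefficient_shift(theta_k):
    """Coefficients of the regression of E(Lambda*v + e | D_t) on (s, z_1, z_2), as in (15).

    theta_k holds the learning share for each z_k, so the right-hand side is
    [0, theta_1 * cov(z_1, x), theta_2 * cov(z_2, x)].
    """
    rhs = [0.0] + [th * c for th, c in zip(theta_k, cov_z_x)]
    return solve(var_sz, rhs)


early = coefficient_shift([0.2, 0.1])   # hypothetical learning shares, low experience
late = coefficient_shift([0.6, 0.7])    # hypothetical learning shares, high experience
d_bs = late[0] - early[0]
d_bz = [late[k + 1] - early[k + 1] for k in range(2)]
phi_zs = [var_sz[0][k + 1] / var_sz[0][0] for k in range(2)]   # cov(z_k, s) / var(s)
restriction_gap = d_bs + sum(p * d for p, d in zip(phi_zs, d_bz))
```

The value of `restriction_gap` is zero up to rounding for any choice of learning shares, because the first row of the normal equations forces the coefficient on s to offset the coefficients on the z variables. The signs of the individual changes are not pinned down in the same way, which is the point made in the text.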
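If the $\theta_{kt}$ are the same for each of the $z_k$ and equal to the common value $\theta_t$, then the time paths of $b_{st}$ and the elements of $b_{zt}$ will all be proportional to $\theta_t$:

\[
\begin{bmatrix} b_{st} \\ b_{zt} \end{bmatrix}
=
\begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix}
+ \theta_t \operatorname{var}(s,z)^{-1}
\begin{bmatrix} 0 \\ \operatorname{cov}(z, \Lambda v + e) \end{bmatrix}
\]

The $\theta_{kt}$ will be the same for all k if the following two conditions on the conditional distribution f of $D_t$ and the conditional mean of $z_k$ hold:

Condition 1. $f(D_t \mid z_k, \Lambda v + e) = f(D_t \mid \Lambda v + e)$ for all k;
Condition 2. $E(z_k \mid \Lambda v + e) = \Phi_k R(\Lambda v + e)$, where $\Phi_k$ is a scalar that is specific to k and R is some function.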
Basically, these conditions imply that the distribution of the signal $D_t$ is driven by $e + \Lambda v$ and that the signal $D_t$ is not more informative about particular elements of v than others.14 The conditions will hold if, for example, $D_t$ is generated by (5) and e and the elements of v are normally distributed. The conditions rule out the possibility that the range of a particular element of v, say $v_k$, is either -100,000, 0, or 100,000. In this case, a very small or very large value of $D_t$ would be very informative about $v_k$.
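As a rough illustration of the normal case, the sketch below simulates a setting in which Conditions 1 and 2 hold and computes $\theta_{kt}$ for three z variables. The signal structure (t independent noisy observations of $\Lambda v + e$) is an assumption made for this example and merely stands in for equation (5). The loadings `phi` and the noise scale `sigma` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, t, sigma = 400_000, 4, 2.0

m = rng.normal(size=n)                               # stands in for Lambda v + e, variance 1
signals = m[:, None] + sigma * rng.normal(size=(n, t))

# Under normality E(m | D_t) is linear in the mean signal with this weight.
kappa = 1.0 / (1.0 + sigma**2 / t)
belief = kappa * signals.mean(axis=1)

# Condition 2 with R(x) = x: E(z_k | m) = phi_k * m. The z noise is
# independent of the signals given m, which is Condition 1.
phi = np.array([0.8, 0.3, -0.5])
z = m[:, None] * phi + rng.normal(size=(n, 3))

def cov(a, b):
    return np.mean((a - a.mean()) * (b - b.mean()))

theta = np.array([cov(belief, z[:, k]) / cov(m, z[:, k]) for k in range(3)])
```

With these settings `kappa` equals 0.5 and each element of `theta` is close to 0.5, so the learning weight is common across the z variables even though their loadings differ in size and sign.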

13 Initially we were surprised that the conditions that $\Lambda > 0$, $\operatorname{cov}(z, \Lambda v + e) > 0$, and $\theta_{kt}$ is increasing in t for all k do not guarantee that $b_{zt}$ is positive even if $\Phi_{z_k s} > 0$ for all k. The intuition is as follows. The OLS estimator of $b_{zt}$ is equivalent to regressing the wage in period t on the residuals $\tilde z_k$ from the regression of $z_k$ on s. $\tilde z_k$ is the sum of $v_k$ plus the component of the kth element of $\gamma_1 q$ that is orthogonal to s. The components of $\gamma_1 q$ that are orthogonal to s are unrelated to $v_k$ and e but are likely to be correlated across $z_k$. Consequently, using OLS to estimate $b_{zt}$ is analogous to applying OLS in a situation in which several of the regressors are measured with error, and the measurement errors are correlated. (The $\tilde z_k$ may be thought of as noisy measures of $v_k$.) It is possible in such a situation for the probability limit of the OLS estimator to take on the wrong sign.
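14 To establish the conditions, note first that

\[
\begin{aligned}
\operatorname{Cov}(E(\Lambda v+e \mid D_t), z_k)
&= \int_{\Lambda v+e}\int_{z_k}\int_{D_t} z_k\, E(\Lambda v+e \mid D_t)\, g(D_t, z_k \mid \Lambda v+e)\, h(\Lambda v+e) \\
&= \int_{\Lambda v+e}\int_{z_k}\int_{D_t} z_k\, E(\Lambda v+e \mid D_t)\, g(D_t \mid \Lambda v+e)\, f(z_k \mid \Lambda v+e)\, h(\Lambda v+e) \\
&= \int_{\Lambda v+e} E(z_k \mid \Lambda v+e)\, E\big(E(\Lambda v+e \mid D_t) \mid \Lambda v+e\big)\, h(\Lambda v+e)
\end{aligned}
\]

and that

\[
\operatorname{Cov}(\Lambda v+e, z_k) = \int_{\Lambda v+e} E(z_k \mid \Lambda v+e)\,(\Lambda v+e)\, h(\Lambda v+e).
\]

It is easy to verify from these equations that $\operatorname{Cov}(E(\Lambda v+e \mid D_t), z_k)/\operatorname{Cov}(z_k, \Lambda v+e)$ is the same for all $z_k$ if $E(z_k \mid \Lambda v+e) = \Phi_k R(\Lambda v+e)$.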





These conditions are quite strong. For example, if the firm obtains indicators about subcomponents of v and e as well as y, then it is likely to learn about some components of productivity faster than others. In this case, equations (16) and (17) continue to hold, but the time path of the education slope is a weighted average of the $\theta_{kt}$ that determine the time paths of the coefficients on the individual $z_k$. The paths of the coefficients on the individual $z_k$ will reflect differences across $z_k$ in the rate at which firms learn about the productivity components that they are correlated with. This is an important result, because it states that differences in the effects of particular variables on wage growth may reflect differences in the rate at which firms learn about the variables. This provides an alternative or a complement to the standard view that the differential effects on growth rates reflect differences in the relationship between the variables and other sources of wage growth such as on-the-job training.

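Case 3: s and z are both vectors.
Finally, we consider the case in which both s and z are vectors. In this case we reinterpret all of the related variables and parameters in the model, such as $b_{s0}$, $b_{z0}$, v, etc., as vectors or matrices. The vectors of coefficients $b_{s0}$ and $b_{z0}$ on s and z in the base year satisfy (14), where the vectors $\Phi_{qs}$ and $\Phi_{qz}$ are the coefficients in the regression of q on s and z. The vectors $b_{st}$ and $b_{zt}$ are given by

(17b)

\[
\begin{bmatrix} b_{st} \\ b_{zt} \end{bmatrix}
=
\begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix}
+ A\,[\operatorname{cov}(v, E(\Lambda v + e \mid D_t))]
\]

where

\[
A =
\begin{bmatrix}
-\operatorname{var}(s)^{-1}\operatorname{cov}(s,z)\left[\operatorname{var}(z) - \operatorname{cov}(z,s)\operatorname{var}(s)^{-1}\operatorname{cov}(s,z)\right]^{-1} \\
\left[\operatorname{var}(z) - \operatorname{cov}(z,s)\operatorname{var}(s)^{-1}\operatorname{cov}(s,z)\right]^{-1}
\end{bmatrix}
\]

Since $\operatorname{var}(s)^{-1}\operatorname{cov}(s,z)$ is the matrix of coefficients from the regression of z on s, we obtain the vector version of Proposition 4:

\[
b_{st} - b_{s0} = -\Phi_{zs}\,(b_{zt} - b_{z0}),
\]

where $\Phi_{zs}$ is redefined to be the matrix of coefficients of the regression of the vector z on the vector s. When conditions 1 and 2 are satisfied and the signal $D_t$ is not more informative about particular elements of v than others then (17b) reduces to (17a) with both $b_{st}$ and $b_{zt}$ as vectors.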




Statistical Discrimination on the Basis of Race
Firms observe race. If race is correlated with productivity and firms violate the law and use race as information, then race has the properties of an s variable. To see the empirical implications of this, partition s into two variables, $s_1$ and $s_2$, where $s_1$ is an indicator variable that equals 1 for members of a particular racial group and 0 otherwise, and $s_2$ is schooling.15 In this case, the model implies almost immediately that the coefficient on $s_1$ does not vary over time if the interaction between z and t is excluded from the model. If this interaction is included (17a) implies that the time paths of $b_{s_1 t}$ and $b_{s_2 t}$ are

\[
b_{s_1 t} = b_{s_1 0} - \Phi_{zs_1}(b_{zt} - b_{z0}), \qquad
b_{s_2 t} = b_{s_2 0} - \Phi_{zs_2}(b_{zt} - b_{z0}),
\]

where $\Phi_{zs_1}$ and $\Phi_{zs_2}$ are the coefficients on $s_1$ and $s_2$ in the regression of z on $s_1$ and $s_2$. Assuming $\Phi_{zs_1}$ is negative, as it is when $s_1$ indicates that the person is black and z is AFQT, father's education, or the wage of an older sibling, then the wage coefficient on $s_1$ will rise over time.
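The direction of these time paths can be illustrated with a few lines of arithmetic. The sketch below assumes the vector relationship $b_{st} - b_{s0} = -\Phi_{zs}(b_{zt} - b_{z0})$ with a scalar z. The second moments of $(s_1, s_2, z)$ and the starting coefficients are hypothetical values chosen only so that $\Phi_{zs_1} < 0$ and $\Phi_{zs_2} > 0$.

```python
import numpy as np

# Hypothetical second moments: s1 is a race indicator, s2 is schooling.
var_s = np.array([[0.2, -0.1],
                  [-0.1, 4.0]])
cov_sz = np.array([-0.15, 0.9])              # cov(s1, z), cov(s2, z)

# Coefficients on s1 and s2 in the regression of z on (s1, s2).
phi_zs = np.linalg.solve(var_s, cov_sz)

def b_s_path(b_s0, b_z0, b_zt):
    """Coefficients on (s1, s2) when the coefficient on z has moved to b_zt."""
    return np.asarray(b_s0) - phi_zs * (b_zt - b_z0)

b_s_entry = np.array([-0.05, 0.10])          # hypothetical coefficients at t = 0
b_s_later = b_s_path(b_s_entry, b_z0=0.02, b_zt=0.10)
```

Because `phi_zs[0]` is negative and the coefficient on z rises, the coefficient on the race indicator rises, while the schooling coefficient falls.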
In contrast, if firms obey the law and do not use race as information, then in the econometric model, race has the properties of a z variable. In the case in which race is the only z variable and one s variable, such as education, is included in the analysis, the coefficient on z in equation (11) corresponds to the coefficient on race. The model implies that if (i) race is negatively related to productivity ($\Lambda < 0$), (ii) firms do not statistically discriminate on the basis of race, and (iii) firms learn over time, then the race differential will widen as experience accumulates. The intuition is that with learning firms are acquiring additional information about performance that may legitimately be used to differentiate among workers. If race is negatively related to productivity, then the new information will lead to a decline in wages. If education is negatively related to race, then the coefficient on education should fall over time.
What happens if firms do not discriminate on the basis of race and one adds a second z variable with a time varying coefficient to a model that contains race and an s variable? Let $z_1$ denote race and $z_2$ denote the additional variable, and let $b_{z_1 t}$ denote the coefficient on race when experience is t and $z_2$ is included in the model and let $b^*_{z_1 t}$ denote the corresponding coefficient when
15 The element of r corresponding to the race indicator $s_1$ in the productivity equation (1) is 0 unless consumer or employee tastes for discrimination reduce profitability of employing members of the minority group, as in Becker (1971). (Even if r is 0 race may be negatively related to productivity if it is correlated with elements of z, q, or $\eta$ that affect productivity.) Presumably, firms that violate the law and discriminate in response to their own prejudice or the prejudice of consumers or other employees might also be willing to use race as information. Employers who harbor prejudice against certain groups may be especially unlikely to form beliefs about the productivity of those groups that are rational in the statistical sense used in this paper.



$z_2$ is excluded. Assume that $\theta_{1t} = \theta_{2t} = \theta_t$, where $\theta_{kt}$ is defined below (17) above. In Appendix 3 we show that

\[
\frac{\partial b_{z_1 t}}{\partial t} - \frac{\partial b^*_{z_1 t}}{\partial t}
= -\frac{\partial \theta_t}{\partial t}\,[\Phi_{z_2}\Phi_{z_2 z_1}]
\]

where $\Phi_{z_2}$ is the coefficient on $z_2$ in the regression of $\Lambda v + e$ on s, $z_1$, and $z_2$ and $\Phi_{z_2 z_1}$ is the coefficient on $z_1$ in the regression of $z_2$ on $z_1$ and s. When $z_1$ indicates whether the person is black and $z_2$ is AFQT, father's education, or the wage of an older sibling, $\Phi_{z_2 z_1}$ is negative. If these variables are positively related to productivity, with $\Phi_{z_2} > 0$, then $\partial b_{z_1 t}/\partial t - \partial b^*_{z_1 t}/\partial t > 0$. We conclude that if firms do not statistically discriminate on the basis of race and race is negatively related to productivity, then (1) the race gap will widen with experience and (2) adding a favorable z variable to the model will reduce the race difference in the experience profile. We wish to stress that other factors that influence race differences in experience profiles as well as other forms of discrimination will also influence the wage results.

2.2 Incorporating On-the-Job Training Into the Model
The analysis so far assumes that the effects of z and s on the log of productivity do not depend on t. Human capital accumulation is included in the model through the H(t) function but is assumed to be "neutral" in the sense that it does not influence the time paths of the effects of s and z.16 In the more general case, the time paths of the effects of z and s depend on other factors as well as learning. In this section we first consider the effect that such dependence would have on OLS estimates of the interactions between t and z and s. Then we discuss estimation of a more realistic model that includes both human capital accumulation and learning/statistical discrimination. As we shall see, there is no clean way to sort out the relative roles of these two mechanisms without data on productivity.
Suppose that s is complementary with learning by doing or enhances the productivity of investments in general skills. We return to the case of scalar z. Then the productivity equation (net of training costs) might take the form
(18)

\[
y_t = rs + r_1 st + H(t) + \alpha_1 q + \Lambda z + \eta .
\]

16 One may easily modify the theoretical framework to allow for this form of human capital accumulation. For example, the H(t) function may reflect learning by doing in all jobs that is observable to firms, or worker financed investments in human capital that are observable to firms.



Assuming that the training activity is observed (firms know (18)) and workers pay for the general training, the wage equation (9) becomes
(19)

\[
w_t = (b_{st} + r_1 t)s + b_{zt} z + H(t) + \alpha_1 q + E(\Lambda v + e \mid D_t)
\]

Most discussions of human capital and most of the empirical evidence on employer provided training suggest that education makes workers more trainable and that educated workers receive more training. In this case $r_1$ will be greater than 0.17 Probit models of the probability that a worker receives training in a given year show strong positive effects of schooling and AFQT, as well as a smaller but positive and statistically significant effect of father's education. (See below.)
What are the implications of this for our investigation of the hypothesis that the reliance of employers on easily observable variables to estimate productivity declines over the career? In estimating the model we identify the sum $b_{st} + r_1 t$ rather than $b_{st}$. If $r_1$ is greater than 0, then the estimated relationship between $b_{st} + r_1 t$ and t will be biased against the hypothesis that employers learn about productivity. As it turns out, we find a strong negative relationship between $b_{st} + r_1 t$ and t, which is only consistent with a training interpretation if education reduces learning by doing, the productivity of training investments, and/or the quantity of training investments.
There is also the possibility that the productivity of employer provided training and/or learning by doing depends on z and/or $\eta$. This case is harder to analyze because employers do not observe z and $\eta$ directly and are learning more about them as time goes on. As a start, we consider the extreme case in which firms are fully informed about z, so that $\theta_t$ is 1 and $b_{zt}$ in (9) is a constant in the absence of training.
Suppose that the productivity equation is
(20)

\[
y_t = rs + r_1 st + r_2 zt + H(t) + \alpha_1 q + \Lambda z + \eta, \qquad r_2 > 0
\]

If firms' knowledge of s and q is fully informative about z, then the presence of $r_2$ in the productivity equation should lead the effect of z on the wage to rise with experience even if $b_{zt}$ does not depend on time ($\theta_t = 1$). However, the presence of $r_2 zt$ in the productivity equation seems unlikely to lead to a negative estimate of $\partial b_{st}/\partial t$.
It is important to point out, however, that if the effect of z on y rises with t then introducing the interaction between z and t into the wage equation could lower the estimate of the change over
17 Earnings slopes depend on the expected productivity of the worker if the costs or returns to training depend on variables such as z or s. Altonji and Spletzer (1992) find a relationship between test scores and measures of training using the NLS72 data set, and many studies find a link between schooling and training measures. See, for example, Bartel and Sicherman (1992) and Lynch (1992).




time in the wage response to s. Let $B_{st}$ be the expectation of the OLS estimator of the effect of s on the wage in period t. Then $B_{st}$ is $b_{s0} + r_1 t + \Phi_{zs} r_2 t$, where $\Phi_{zs}$ is the coefficient of the regression of z on s. When one adds z×t to the regression, $B_{st}$ becomes $b_{s0} + r_1 t$. $B_{zt}$, the expectation of the OLS estimator of the effect of z in period t, becomes $b_{z0} + r_2 t$. If $\Phi_{zs} > 0$ and $r_2 > 0$, then $\Phi_{zs} r_2 > 0$. The change in the coefficient on s when z×t is added is $-\Phi_{zs} r_2 t$. Consequently, in the scalar case the simple training model with full information about z implies that $\partial B_{st}/\partial t$ changes by $-\Phi_{zs}\,[\partial B_{zt}/\partial t]$ when z×t is added to the wage equation.
In the pure employer learning/statistical discrimination model $\partial B_{st}/\partial t$ is equal to $\partial b_{st}/\partial t$ and, according to Proposition 3, the learning model also implies that $\partial B_{st}/\partial t$ changes by $-\Phi_{zs}\,\partial b_{zt}/\partial t$ when z×t is added to the wage equation. However, the models differ in their implications for the level of $\partial B_{st}/\partial t$ after z×t is added. A pure human capital model with perfect information implies $\partial B_{st}/\partial t > 0$ unless, in contrast to the available evidence, s has a negative partial effect on the quantity or return to on-the-job training ($r_1 < 0$).
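The omitted-variable argument for the pure training model with full information can be illustrated by simulation. In the sketch below all parameter values, the data generating process, and the variable names are hypothetical. Wages contain both an s×t term and a z×t term, and the regression is run with and without z×t.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000
r1, r2, phi_zs = 0.002, 0.006, 0.5                # hypothetical values

s = rng.normal(size=n)
z = phi_zs * s + rng.normal(size=n)               # regression of z on s has slope phi_zs
t = rng.integers(0, 11, size=n).astype(float)     # experience, independent of (s, z)
w = (0.08 * s + r1 * s * t + 0.01 * z + r2 * z * t
     + 0.03 * t + 0.2 * rng.normal(size=n))

def ols(columns, y):
    """OLS with an intercept; returns the coefficient vector."""
    X = np.column_stack([np.ones(len(y))] + columns)
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Index 2 is the coefficient on s*t in both regressions.
slope_without = ols([s, s * t, z, t], w)[2]
slope_with = ols([s, s * t, z, t, z * t], w)[2]
change = slope_with - slope_without
```

The coefficient on s×t moves from roughly `r1 + phi_zs * r2` to roughly `r1` when z×t is added, a change of about `-phi_zs * r2`. The level after adding z×t remains positive, which is the feature that distinguishes this model from the learning model.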

Controlling for Training.
In the absence of data on productivity, sorting out the relative importance of employer learning and non-neutral (with respect to z and s) on-the-job training may require that one build a model of the quantity of training as a function of s and z and use a proxy based on the training model to control for the effects of non-neutral general human capital accumulation in the wage equation. This raises a number of difficulties that we explore in the next few paragraphs.
We return to the case of scalar z. Assume that the productivity equation (net of training costs) takes the form
(21)

\[
y_t = rs + \Psi(\Sigma T_{it}) - C(T_{it}) + H(t) + \alpha_1 q + \Lambda z + \eta .
\]

where $\Sigma T_{it} = \sum_{\tau=1}^{t} T_{i\tau}$, $\Psi(\cdot)$ is an increasing function that summarizes the effect of accumulated training on productivity, $C(T_{it})$ is the cost in terms of the log of productivity of $T_{it}$ units of training in period t, and the function H(t) has been redefined to accommodate the inclusion of training. Assume that $T_{it}$ is determined by employer beliefs about productivity given $D_t$, q, s, and t, as well as by $D_t$, q, s, and experience. Then
(22)

\[
T_{it} = h(D_t, q, s, t) = r(s,z,t) + u_t
\]




where $r(s,z,t)$ is $E(h(D_t, q, s, t) \mid s,z,t)$ and $u_t$ is an error term that is related to q and $D_t$ but is assumed independent of s, z, and t. Following through on a series of substitutions that parallels those leading to (8), and assuming that the worker pays for and receives the returns to the general training yields the wage equation
(23)

\[
w_t = (r + \Lambda\gamma_2 + \alpha_2)s + \Psi(\Sigma T_{i\tau}) - C(T_{it}) + H^*(t) + (\alpha_1 + \Lambda\gamma_1)q + E(\Lambda v+e \mid D_t) + \cdots
\]

Suppose that up to an irrelevant constant $\Psi(\sum_{\tau=1}^{t} T_{i\tau}) = \psi_1 \sum_{\tau=1}^{t} T_{i\tau}$ and $C(T_{it}) = c_1 T_{it}$. Then the regression function relating $w_t$ to s, z, $\Sigma T_{i\tau}$, and $T_{it}$ in period t may be written as
(24)

\[
w_t = (r + a_{1t})s + (\psi_1 + a_{2t})\Sigma T_{i\tau} + (a_{3t} - c_1)T_{it} + a_{4t} z + H^*(t) + \text{error term}
\]

where $a_{1t}$, $a_{2t}$, $a_{3t}$, and $a_{4t}$ are the coefficients of the linear least squares projection of $(\alpha_1 + \Lambda\gamma_1)q + E(\Lambda v+e \mid D_t)$ onto s, $\Sigma T_{i\tau}$, $T_{it}$, and z, and the error term is unrelated to the variables in the model by definition of $a_{1t}$, $a_{2t}$, $a_{3t}$, and $a_{4t}$. The time path of $a_{1t}$ and $a_{4t}$ will be influenced by changes over time in the correlations of s, z, and $(\alpha_1 + \Lambda\gamma_1)q$ with $\Sigma T_{i\tau}$ and $T_{it}$ as well as changes over time in the correlations of z, $\Sigma T_{i\tau}$, and $T_{it}$ with $E(\Lambda v+e \mid D_t)$. (The coefficients of the experience profile $H^*(t)$ will be influenced as well.)
Two implications follow from (23) and (24). First, even if training depends only on information that is known to the firm at the start, the relationship between q and $T_{it}$ and $\Sigma T_{i\tau}$ may change with t, leading to changes over time in the coefficient on s even if there is no learning. The second point follows from the fact that training depends on $D_t$ and so will be correlated with it. The least squares estimates of the coefficients on the training variables will reflect both the direct effect of training and a relationship between the time path of $T_{it}$ and $E(\Lambda v+e \mid D_t)$. As a result, the effect of adding the training variables to the model on $b_{st}$ and $b_{zt}$ is complicated in a mixed human capital/employer learning model. In particular, one might expect the addition of functions of $T_{it}$ and $\Sigma T_{i\tau}$ to the model to change and quite possibly reduce the rate of increase of $b_{zt}$, for two reasons. First, the training variables change over time and are positively correlated with z. Second, they will absorb part of the trend in $E(\Lambda v+e \mid D_t)$, and it is changes in this term that induce the variation with t in $b_{st}$ and $b_{zt}$. Furthermore, the introduction of the training terms alters the partial correlation between z and s, which changes the effect on the path of $b_{st}$ of introducing z with a time varying coefficient.

Unfortunately, we do not have a way to isolate the effects of training from the effects of statistical discrimination with learning if, as seems plausible, the quantity of training is influenced by the employer beliefs about productivity. Consider the null hypothesis that (1) learning is important, (2) variation with s and z in the rate of skill accumulation is not, and (3) variation in our measure of training is driven by worker performance (which leads to promotion into jobs that offer training) rather than by exogenous differences in the level of human capital investment. Even under this hypothesis one would expect the introduction of the training measures to lead to a reduction in the growth over time in the coefficient on z and a reduction in the impact of z on the time path of the coefficient on s. With an indicator of $y_{it}$, that problem is easily solved, but we lack such an indicator.
Despite the absence of a clear structural interpretation of the results we think it is important in this initial study to see how introducing measures of training alters $b_{st}$ and $b_{zt}$. Consequently, below we report estimates of (24). There are two additional problems in using the training data. First, the measure $T^*_{it}$ of $T_{it}$ is almost certain to contain measurement error. Second, the quality of the training data prior to 1988 is too poor to be used, which means that the data needed to form the measure $\Sigma T_{it}$ is missing for persons who left school prior to that year. We do not have a solution for the first problem but deal with the latter problem by estimating a flexible model relating $T^*_{it}$ to s, z, and t using data from 1988-1993 and using the model to impute values in the earlier years.18 We estimate variants of (24) below. Our preferred specification is a wage growth model based on the first difference of (24). The growth specification has the advantage of only requiring data on $T_{it}$ and $T_{it-1}$. Perhaps more importantly, this specification also eliminates bias from unobserved person specific effects that are known to firms and are correlated with both training and wages.
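The last point, that differencing removes person specific effects, can be seen in a deliberately simplified simulation. The sketch below is not the specification in (24): it assumes a single constant return `beta` to cumulative training and a person effect `mu` that raises both training and wages. All values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n, periods, beta = 50_000, 6, 0.05

mu = rng.normal(size=n)                           # person effect known to firms
# Training per period is higher for high-mu workers (hypothetical rule).
T = np.clip(0.5 + 0.5 * mu[:, None] + rng.normal(size=(n, periods)), 0.0, None)
cum_T = T.cumsum(axis=1)
w = mu[:, None] + beta * cum_T + 0.1 * rng.normal(size=(n, periods))

def slope(x, y):
    """Simple regression slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Levels: the person effect is in the error and is correlated with training.
slope_levels = slope(cum_T.ravel(), w.ravel())

# First differences: w_t - w_{t-1} = beta * T_t + noise, and mu drops out.
slope_fd = slope(T[:, 1:].ravel(), np.diff(w, axis=1).ravel())
```

In this design `slope_levels` is well above `beta`, while `slope_fd` is close to `beta`, because the differenced equation needs only current training and no longer contains `mu`.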
3. Data

The empirical analysis is based on the 1992 release of NLSY. The NLSY is a panel study of
men and women who were aged 14-22 in 1978. Sample members have been surveyed annually
since 1979. (In 1994 the NLSY moved to a biennial survey schedule.) The NLSY is an attractive
data set for the study of employer learning and statistical discrimination. First, the sample sizes are
large. Second, sample members are observed at or near the start of their work careers and are
followed for several years. Third, the NLSY contains detailed employment histories, including
reasons for job changes. Fourth, it contains a rich set of personal characteristics that may be related
to productivity and may be hard for employers to observe, including father and mother's education
18 Spletzer and Lowenstein (1996) provide means of dealing with measurement error in the training data but these are beyond the scope of our study.




and occupation, drug and alcohol use, criminal activity, AFQT, aspirations and motivation, and
performance in school. Furthermore, the data set contains a large number of siblings. The earnings
of older siblings as well as parents may be used as indicators of characteristics of younger siblings
that affect productivity but are hard for employers to observe. Finally, it contains measures of
training, which we need to investigate the possibility that variation with experience in the effects of
schooling and our measures of hard to observe personal characteristics are due to a relationship
between these variables and the quantity of training received.
We restrict the analysis to men who are white or black who have completed 8 or more years
of education. We exclude labor market observations prior to the first time that a person leaves
school and accumulate experience from that point. When we analyze wage changes, we further
restrict the sample to persons who do not change education between successive years. Actual
experience is the number of weeks in which the person worked more than 30 hours divided by 50.
Potential experience is defined as age minus years of schooling minus 6. To reduce the influence of
outliers, father's education (F_ED) is set to 4 if father's education is reported to be less than 4.
AFQT is standardized by age.19 The means, standard deviations, minimums and maximums of the
variables used in the analysis are provided in Table A1 in the Appendix, along with the variable
definitions. The mean of actual experience is 4.9. The mean of potential experience is 7.3, and the
mean of education is 12.7. All statistics in the paper are unweighted. Blacks are over sampled in
the NLSY and contribute 28.8 percent of our observations. Table A2 reports correlation
coefficients and simple regression coefficients that summarize the relationships among the key
variables used in the analysis.
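The variable definitions in this section translate into a few simple transformations. The sketch below is illustrative only: the function names are hypothetical, missing values are not handled, and the use of the population standard deviation within each age group is an assumption, since the text does not say which divisor is used.

```python
from statistics import mean, pstdev

def actual_experience(weeks_over_30_hours):
    """Weeks in which the person worked more than 30 hours, divided by 50."""
    return weeks_over_30_hours / 50

def potential_experience(age, years_of_schooling):
    """Age minus years of schooling minus 6."""
    return age - years_of_schooling - 6

def floor_father_education(f_ed, floor=4):
    """Set father's education to the floor when it is reported below the floor."""
    return max(f_ed, floor)

def standardize_by_age(scores, ages):
    """Subtract the age-specific mean and divide by the age-specific std. dev.

    Assumes every age group contains at least two distinct scores.
    """
    by_age = {}
    for score, age in zip(scores, ages):
        by_age.setdefault(age, []).append(score)
    stats = {age: (mean(v), pstdev(v)) for age, v in by_age.items()}
    return [(score - stats[age][0]) / stats[age][1]
            for score, age in zip(scores, ages)]

standardized = standardize_by_age([40, 60, 50, 70], [16, 16, 17, 17])
```

For example, 250 qualifying weeks correspond to 5 years of actual experience, and a 30 year old with 12 years of schooling has 12 years of potential experience.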
4. Results for Education

In Tables 1-3 we report estimates of our basic wage level specification. In Table 1 we use
potential experience as the experience measure and use OLS to estimate the model. The equations
also control for a cubic in experience, a quadratic time trend, residence in an urban area, and dummy

19 The age of the sample members at the time the AFQT was administered varies somewhat in the NLSY sample. This induces some variation in schooling levels at the time the AFQT is taken. To calculate standardized AFQT, we adjust the raw AFQT score by subtracting the mean score for each age and dividing by the standard deviation for that age. For individuals with siblings in the sample, the coefficients of the regression of the unadjusted test score of the older sibling on the test score of the younger sibling and the regression of the test score of the younger sibling on the test score of the older sibling are very similar after one also controls for age, suggesting that the information in the test is not very sensitive to age over the range in the sample.




variables for whether father's education is missing and whether AFQT is missing. We add
interactions between the dummy variables for missing data and experience when interactions
between father's education and experience and AFQT and experience are added to the model.
These variables are not reported in the tables. Standard errors are White/Huber standard errors
computed accounting for the fact that there are multiple observations for each worker.
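Standard errors of this kind are usually computed with a cluster-robust sandwich formula, with each worker treated as a cluster. The sketch below is a minimal version without any finite-sample correction. The function name and the simulated data are hypothetical, and it is not claimed to reproduce the exact estimator used for the tables.

```python
import numpy as np

def ols_cluster_se(X, y, groups):
    """OLS coefficients with cluster-robust (sandwich) standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        rows = groups == g
        score = X[rows].T @ resid[rows]       # sum of x_i * u_i within the cluster
        meat += np.outer(score, score)
    cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(cov))

# Hypothetical panel: 500 workers, 8 wage observations each.
rng = np.random.default_rng(4)
persons, years = 500, 8
person_id = np.repeat(np.arange(persons), years)
schooling = np.repeat(rng.normal(size=persons), years)
person_effect = np.repeat(rng.normal(size=persons), years)
log_wage = 0.1 * schooling + person_effect + rng.normal(size=persons * years)

X = np.column_stack([np.ones(persons * years), schooling])
beta, se_cluster = ols_cluster_se(X, log_wage, person_id)
# Treating every observation as its own cluster gives White (HC0) errors.
_, se_white = ols_cluster_se(X, log_wage, np.arange(persons * years))
```

With a persistent person effect, the clustered standard error on schooling is noticeably larger than the one that ignores the repeated observations on each worker.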
In column 4 we present an equation that includes s, Black, and s×t. This corresponds to (7a) with $b_{st}$ restricted to $b_{st} = b_{s0} + b_{s1} \times t$. The coefficient on s×t/10 is -.0077 (.0062), suggesting that the effect of education on wages declines slightly with experience. In column 5 we add AFQT and F_ED, where F_ED is years of father's education. As has been well documented, AFQT has a powerful association with earnings even after controlling for education. A shift in AFQT from one standard deviation below the mean to one standard deviation above is associated with an increase in the log wage of .157. The coefficient on education declines to .080 and $b_{s1}$ becomes more negative.
In column 6 we add linear interactions between t and two different z variables, AFQT and
F_ED, to the equation. The resulting equation corresponds to (9) with the restriction that $b_{st} = b_{s0} + b_{s1} \times t$ and $b_{zt} = b_{z0} + b_{z1} \times t$, except that we introduce two z variables rather than 1. The estimates
imply that the effect of AFQT on the wage increases greatly with experience t. $b_{AFQT1}$, which is the
coefficient on AFQT×t/10, is .0820 (.0125). $b_{AFQTt}$, which is $\partial w_t/\partial AFQT$, rises from only .0179
when experience is 0 to .0999 when experience is 10. The results imply that when experience is 10
and education is held constant, persons with AFQT scores one standard deviation above the mean
have a log wage that is .200 larger than persons with AFQT scores one standard deviation below the
mean, while the difference is only .036 when experience is 0. The effect of father's education also
increases with experience. The main effect is actually slightly negative (but not significant).
However, the interaction term is positive, though not statistically significant.
Our results for AFQT and F_ED are consistent with Farber and Gibbons' results in which
they use the components of AFQT and an indicator for whether the family had a library card when
the person is 14 that are orthogonal to the wage on the first job and education. The key result in the
table is that the coefficient on s×t/10 declines sharply (to -.0351 (.0069)) when AFQT×t and
F_ED×t are added. The implied effect of an extra year of education for a person with 10 years of
experience is only .0633. Strikingly, the coefficient on s rises to .0984 which is almost exactly what
we obtain when we exclude all terms involving F_ED and AFQT from the model (columns 1 and 4).
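The implied effects quoted above follow from the linear interaction specification. The short sketch below reproduces the arithmetic using the point estimates reported in the text for column 6 of Table 1; the function names are hypothetical.

```python
def afqt_effect(t, main=0.0179, per_decade=0.0820):
    """Partial effect of standardized AFQT on the log wage at experience t."""
    return main + per_decade * t / 10

def schooling_effect(t, main=0.0984, per_decade=-0.0351):
    """Partial effect of a year of education on the log wage at experience t."""
    return main + per_decade * t / 10

# Log wage gap between persons one s.d. above and one s.d. below the mean AFQT.
gap_at_entry = 2 * afqt_effect(0)       # about .036
gap_at_ten = 2 * afqt_effect(10)        # about .200
schooling_at_ten = schooling_effect(10) # about .0633
```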





These results provide support for the hypothesis that employers have limited information
about the productivity of labor force entrants and statistically discriminate on the basis of education.
Early wages are based on expected productivity conditional on easily observable variables such as
education. As experience accumulates, wages become more strongly related to variables that are
likely to be correlated with productivity but hard for the employer to observe directly. When we
condition the experience profile of earnings on both easy to observe variables, such as education,
and hard to observe variables, such as AFQT and father's education, we find the partial effect of the
easy to observe variables declines substantially with experience. While one might argue that the
positive coefficients on AFQT×t and F_ED×t are due to an association between these variables and
training intensity, it is hard to reconcile this view with the negative coefficient on s×t. While
measurement error in schooling may enhance the effect of F_ED and AFQT and may partially
explain the decline in s between columns 1 and 3, it does not provide a simple explanation for the
behavior of the interaction terms with experience.
In Table 2 we present OLS results using actual experience in place of potential experience as
the experience measure t. The main difference between this table and Table 1 is that the return to
education is lower and the s×t interaction is positive and fairly large in the equations that exclude
AFQT×t and F_ED×t. However, the coefficient on s×t/10 declines from .0200 in column 5 to .0056 when the interaction terms are added in column 6 of Table 2. This decline is similar to the
decline that we obtain in column 3.
The results in Table 2 are difficult to interpret, because the intensity of work experience may
be conveying information to employers about worker quality. It is an outcome measure itself. The
implications of employer learning for the wage equation are changed if one conditions on
information that becomes available to employers as the worker's career unfolds and may reflect the
productivity of the worker. Conditioning on actual work experience raises some of the issues that
would arise if we conditioned on wages in t-1 or on training received. On the other hand, the results
based on potential experience are likely to be biased by the fact that potential experience
mismeasures actual. For this reason, in Table 3 we report the results of re-estimating the models by
instrumental variables (IV), treating all terms involving actual experience as endogenous with
corresponding terms involving potential experience as the instruments. The results in columns 5 and
6 of Table 3 are basically consistent with those in Table 1. The coefficient on AFQT is .0177
(.0096) and the coefficient on AFQT×t/10 is .1148 (.0164). These estimates imply that conditional
on years of schooling, AFQT has only a small effect on initial wages, but when t is 10 a two




standard deviation shift in AFQT is associated with a wage differential of .247. The coefficient on
s×t/10 declines from -.0181 when the interactions are excluded in column 5 to -.0561 in column 6.
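The instrumental variables strategy, in which every term involving actual experience is instrumented with the corresponding term involving potential experience, can be sketched as follows. The data generating process is hypothetical: an unobserved `ability` term raises both the share of potential experience actually worked and the wage, which makes actual experience endogenous. Because the model is exactly identified, the IV estimator is $(Z'X)^{-1}Z'w$.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200_000

s = rng.normal(0.0, 2.0, size=n)                  # schooling, centered (hypothetical)
ability = rng.normal(size=n)                      # unobserved by the econometrician
t_pot = rng.integers(1, 15, size=n).astype(float) # potential experience
share_worked = np.clip(0.7 + 0.1 * ability + 0.05 * rng.normal(size=n), 0.0, 1.0)
t_act = t_pot * share_worked                      # actual experience

w = (1.0 + 0.09 * s + 0.04 * t_act - 0.002 * s * t_act
     + 0.2 * ability + 0.2 * rng.normal(size=n))

ones = np.ones(n)
X = np.column_stack([ones, s, t_act, s * t_act])  # regressors with actual experience
Z = np.column_stack([ones, s, t_pot, s * t_pot])  # instruments with potential experience

beta_ols = np.linalg.lstsq(X, w, rcond=None)[0]
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ w)
```

In this design OLS overstates the return to actual experience because `t_act` is correlated with `ability`, while the IV estimates are close to the values used to generate the data (0.04 for experience and -0.002 for the interaction with schooling).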
Controlling for Secular Change in the Return to Education
In column 9 of Tables 1, 2, and 3 we add the interaction between s and calendar time to the
model containing father's education and AFQT.20 In the case of potential experience in Table 1, the
education slope is reduced by .02 per year, and the interaction between education and experience/10
drops to -.051, but otherwise the results change little. In column 10 we add the interactions
between calendar time and s, F_ED, and AFQT to the model containing the interactions between t
and all three variables. In column 10 the interactions between F_ED and AFQT and calendar time
have positive coefficients, indicating that the effects of these variables rose during the 1980s.
Adding the time interactions reduces the size of the experience interactions with F_ED and AFQT,
but the qualitative pattern of the results does not change.
Controlling for Occupation
One objection to the theoretical framework underlying the estimates in Tables 1-3 is that it
assumes that the flow of information to employers is independent of the type of job the worker
begins in. This is contrary to the idea that some jobs are "dead end" jobs. Perhaps education (and
high AFQT) enables a worker to gain access to jobs in which firms have the ability to observe
whether the worker has higher level skills that are strongly related to productivity. As a simple
check on this possibility, we present a series of equations in Table 4 that control for the 2-digit
occupation of the first job. The results are very similar to what we obtain when occupation is
excluded.21
The Effects of the Wage of a Sibling

20 Murphy and Welch (1992), Katz and Murphy (1992), Taber (1996) and Chay and Lee (1997) are among a large number of recent studies of changes in the structure of wages in the U.S. Since calendar time is positively correlated with experience t in a panel data set, the learning/statistical discrimination model implies that estimates of secular changes in the return to education and AFQT will be biased in opposite directions if one fails to add the interaction between these variables and t to the model.
21 An interesting project for future research would be to use information from the Dictionary of Occupational Titles on skill requirements of occupations and trace how easy to observe and hard to observe productivity characteristics are related to changes over a career in the skill requirements of the job a worker holds. It would also be interesting to examine how the slopes are influenced by the skill requirements of the initial occupation held by the individual.



In Table 5, we use the wages of siblings with 5 to 8 years of experience as a hard to observe
background characteristic. The coefficient on s×t/10 is -.0097 (.0089) in column 4, which includes
the log of the wage of the oldest sibling. The learning model does not provide an explanation for
the negative interaction term, nor does the conventional view of how education is related to on-the-job training. However, when we add the interaction between the sibling wage and t in column 5, the
coefficient on the education interaction falls to -.0146, and the coefficient on the interaction between
the sibling wage and t/10 is .086 (.0327).22 The effect of the sibling wage rises from .127 upon labor
force entry to .213 after 10 years of experience, a very large increase. The point estimate of the
interaction between education and experience result is essentially unchanged when we allow the
effect of sibling wage. In Table 5, columns 5 and 6, we show that these results are robust to
allowing the effects of education and the sibling wage to depend on calendar time. Our
interpretation of these results begins with the premise that the labor market productivities of siblings
are correlated. As a worker acquires experience this correlation is reflected in the performance
record D_t and in wage rates. The sibling wage is correlated with education, and so the effect of
education on the wage declines with experience because firms are estimating productivity with a
bigger information set than at the time of labor force entry.23
The Experience Profile of the Effects of AFQT and Education on Wages
In this section we take a more detailed look at how the effects of AFQT and s vary with
experience by estimating models of the form
$w_t = f(z,t;b_z) + h(s,t;b_s) + H(t) + e_{it}$
where $b_z$ and $b_s$ are now vectors of parameters. Table 6 is based on models in which $f(z,t;b_z)$ and
$h(s,t;b_s)$ are quartic polynomials in t. In the top panel, the experience measure is potential

22 The corresponding point estimates are -.022 and .080 when we allow the effects of education and the sibling wage to
depend on calendar time.
23 Farber and Gibbons (1996) use men and women, include Hispanics, and restrict their sample to persons who have
worked at least three consecutive years since attending school. Using this sample the coefficients on AFQT×t and the
effect on s×t of adding AFQT×t are similar to those reported above. We also obtain qualitatively similar results when we
follow Farber and Gibbons and use the level of wages rather than the log. We experimented with an indicator for
whether any person in the respondent's household had a library card at the time the respondent was 14, a variable which
Farber and Gibbons also used. We confirm Farber and Gibbons' finding that the coefficient on the residual from a
regression of this variable on the initial real wage, education, part-time status, an interaction between education and part-time
status, race, sex, age, and calendar year increases with experience, as well as their finding that the results for library
card and AFQT are weakened substantially when these variables are interacted with calendar time. However, when we use
the library card variable itself the effect of the library card variable falls rather than rises with experience. We thank
Henry Farber for assisting us in reconstructing the Farber and Gibbons sample.



experience; in the bottom panel we use actual experience instrumented by potential experience. All
of the models in the tables contain the other control variables discussed above. They also include
F_ED and F_ED×t.
The columns report $\partial w_t/\partial \mathrm{AFQT}$, $\partial^2 w_t/\partial \mathrm{AFQT}\,\partial t$, $\partial w_t/\partial s$, and $\partial^2 w_t/\partial s\,\partial t$ at various experience
levels. The first column of the table shows that $\partial w_t/\partial \mathrm{AFQT}$ increases steadily from .0197 when t is 0
to .121 when t is 12. (We only go out to t=12 because sample information becomes thin at higher
values.) The specification that we use in most of the paper, in which $f(z,t;b_z)$ and $h(s,t;b_s)$ are linear
in t (column 6 in Tables 1-3), suggests an increase in $\partial w_t/\partial \mathrm{AFQT}$ from .0179 to .116 as t goes from
0 to 12.
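As an illustration of how such marginal effects are read off a polynomial interaction, the sketch below assumes that f(z,t;b_z) takes the form z times a polynomial in t. The function names are hypothetical, and because the quartic coefficients behind Table 6 are not reported in the text, the example uses only the linear special case implied by the two endpoint values quoted above (.0179 at t=0 and .116 at t=12).

```python
def marginal_effect(coefs, t):
    """d w_t / d z when f(z, t; b) = z * sum_k coefs[k] * t**k (assumed form)."""
    return sum(b * t**k for k, b in enumerate(coefs))


def marginal_effect_slope(coefs, t):
    """d^2 w_t / (dz dt): the derivative of the polynomial above with respect to t."""
    return sum(k * b * t**(k - 1) for k, b in enumerate(coefs) if k > 0)


# Linear special case backed out from the endpoints quoted in the text.
# A quartic would simply use five coefficients instead of two.
linear = [0.0179, (0.116 - 0.0179) / 12]

for t in (0, 4, 8, 12):
    print(t, round(marginal_effect(linear, t), 4), round(marginal_effect_slope(linear, t), 5))
```

With a linear profile the second function returns the same slope at every t; a quartic lets that slope rise and then fall with experience, which is the pattern the nonlinear specification is designed to detect.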
As noted earlier, employer learning implies that $\partial w_t/\partial \mathrm{AFQT}$ is nondecreasing in t (i.e.,
$\partial^2 w_t/\partial \mathrm{AFQT}\,\partial t \geq 0$), with a strict inequality likely if some new information arrives each period on y.
If the noise in observations of $y_t$ is iid, then the rate of increase $\partial^2 w_t/\partial \mathrm{AFQT}\,\partial t$ should decline with
t, as shown in expression (12c) for $\theta_t$ above. The rate of increase must decline eventually because
the amount of additional information in observations of labor market performance is declining. ($\theta_t$ is
bounded at 1.) However, it is possible that the first two or three observations on a worker are
particularly noisy because of factors that we have left out of the model. For example job specific or
occupation specific match quality may be more variable for new workers than more experienced
ones.
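The declining rate of increase under iid noise can be illustrated with a minimal sketch. It assumes the textbook signal-extraction weight for t equally noisy observations, which is not necessarily the exact form of expression (12c); the function name and the variance values are hypothetical and chosen only for illustration.

```python
def learning_weight(t, var_signal, var_noise):
    """Weight placed on accumulated performance after t iid noisy observations.

    Assumed textbook form: t * var_signal / (t * var_signal + var_noise).
    """
    return t * var_signal / (t * var_signal + var_noise)


# Hypothetical variances: noise is four times as variable as the signal.
weights = [learning_weight(t, 1.0, 4.0) for t in range(13)]
increments = [later - earlier for earlier, later in zip(weights, weights[1:])]

for t, w in enumerate(weights):
    print(t, round(w, 3))
```

Under this assumed form the weight starts at zero, rises toward but never reaches 1, and each yearly increment is smaller than the one before it. An initial period of unusually noisy observations would break the last property in the first few years, as discussed in the text.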
In column 2 we report $\partial^2 w_t/\partial \mathrm{AFQT}\,\partial t$ for various experience levels. The values increase
from .0025 when t is 0 to .0104 when t is 5, remain at about this level until t is 8 (the maximum is
.0108 at t = 6.5) and then decline to .0048 when t is 12. These results are reasonably consistent
with a decline in the amount of new information with experience after an initial period of noisy
observations.24
In panel B we replace potential experience with actual experience, and treat actual
experience as endogenous. The 99th percentile value for this variable is only 13.33, so there is not
much sample information on t beyond this point. In column 1 we see that the effect of AFQT
increases with experience. The rate of increase $\partial^2 w_t/\partial \mathrm{AFQT}\,\partial t$ rises at first from .0092 when t = 0
to .0138 when t = 5, but declines to -.0012 when t = 12. However, the standard errors on these
24 We used two other non-linear specifications. The first used spline functions with break points at t=2, t=4, t=7, and t=10.
In the second we restricted $f(z,t;b_z)$ so that $\partial^2 w_t/\partial \mathrm{AFQT}\,\partial t = 0$ when t is 25 and $h(s,t;b_s)$ so that $\partial^2 w_t/\partial s\,\partial t = 0$ when t is
25. The idea is that the information about productivity that is contained in AFQT is fully revealed by the time t is 25.
Both of these specifications yielded results similar to the reported model in which $\partial^2 w_t/\partial \mathrm{AFQT}\,\partial t$ is flat or increasing and
then definitively decreasing after about 7 years.




derivatives are quite large. These results are also loosely consistent with the proposition that the
rate at which new information about initial productivity arrives declines with experience, but the
estimates are not sufficiently precise to say much about this. As the NLSY sample ages, it will be
interesting to revisit the issue.
In the model with potential experience, the return to education increases slightly between
t=0 and t=3, and then declines sharply. In the model with actual experience, the decline is constant
throughout from .0881 at no experience to .0299 at 12 years of experience.
Testing the restrictions on the experience profiles of the effects of s and z on the wage.
It is interesting to see how well the experience profiles of the education and AFQT coefficients
satisfy the restrictions in propositions 3 and 4. One complication in performing these tests is the place
of race within our model: should we treat race as an s variable or a z variable? The answer to this
question hinges on the extent to which employers violate the law and use race as an indicator of
productivity. We discuss this at length in section 5 below. For now we will sidestep the issue by
running separate tests on the white and black samples.

Consider first a specification in which s and z are both scalars, education and AFQT score, respectively.
Proposition 3 says that the product of -cov(s,z)/var(s) (the negative of the coefficient of the regression
of z on s) and the coefficient on the interaction between AFQT and experience (z×t) should equal the
coefficient on the interaction between education and experience (s×t). In the white sample, the product
is -.00162 and the coefficient on s×t is -.00232. A Wald test does not reject the proposition. In the black
sample the corresponding numbers are -.00196 and -.00498 and the proposition is rejected.25
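The comparison behind this test can be sketched as follows. The helper names, the toy data, and the two interaction coefficients are hypothetical; they only show how the implied value of the s×t coefficient is built from the auxiliary regression of z on s. A formal Wald test would also need the covariance matrix of the estimates, which is not shown here.

```python
def mean(xs):
    return sum(xs) / len(xs)


def regression_slope(y, x):
    """OLS slope from regressing y on x with an intercept: cov(x, y) / var(x)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def implied_s_interaction(s, z, b_zt):
    """s*t coefficient implied by the restriction: -(cov(s, z) / var(s)) * b_zt."""
    return -regression_slope(z, s) * b_zt


# Hypothetical data: years of schooling and a standardized test score.
s = [10, 12, 12, 14, 16, 16]
z = [-1.0, -0.5, 0.0, 0.2, 0.8, 1.1]
b_zt = 0.008    # hypothetical coefficient on z*t
b_st = -0.002   # hypothetical coefficient on s*t

implied = implied_s_interaction(s, z, b_zt)
print(round(implied, 5), round(b_st - implied, 5))  # implied value and the gap under test
```

The restriction holds when the gap printed in the last line is statistically indistinguishable from zero.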
We might also want to test whether the entire profile of the interactions between s and t and
between z and t are in accordance with proposition 3. One way to do this is to estimate the model
in which the interactions are specified as fourth-order polynomials and jointly test whether the
coefficients on the four polynomial interactions are related by the coefficient of the regression of z
on s. This seems a bit restrictive in that we only expect the relationship to hold over the range of
observed data and polynomials that have very different coefficients can be fairly similar over a short
range. However, we performed these tests on models in which the interactions of AFQT and
education with experience are modeled as fourth order polynomials. Once again, we fail to reject
the proposition for whites but reject for blacks.

25 It should be noted that the standard errors for these tests do not account for possible heteroscedasticity in the data.



We also tested proposition 4, the vector analog of proposition 3, on models which include
both AFQT and father’s education. We also considered as z variables the dummy variables
indicating whether these quantities were known. This test amounts to a t-test of whether the sum of
the products of -cov(s,z)/var(s) and the coefficient on z×t for each z variable is equal to the
coefficient on s×t. For whites, the sum of the products equals -.00193, the coefficient on s×t is
-.00254, and the proposition is not rejected. For blacks, we obtain -.00166 and -.00456 and reject
the proposition.
Wage Growth Equations
In Table 7 we estimate (9) in first difference form. We restrict $b_{st}$ to be $b_{s0} + b_{s1} t$ and $b_{zt}$ to
be $b_{z0} + b_{z1} t$. The usual reason for working in first differences is to eliminate correlation between
the regressors and a fixed error component. This motivation is not compelling in the present case.
However, it is possible that the first difference specification may be less sensitive to errors in
identifying when individuals start their careers.
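A minimal sketch of what differencing does under the linear restriction is given below. It assumes the wage equation has the form (b_s0 + b_s1 t) s + (b_z0 + b_z1 t) z plus a fixed person effect, and it omits the experience profile H(t) and the error term for brevity. The function names and coefficient values are hypothetical.

```python
def log_wage(s, z, t, person_effect, b):
    """Log wage under the assumed linear-in-t restriction on the s and z coefficients."""
    return (b["s0"] + b["s1"] * t) * s + (b["z0"] + b["z1"] * t) * z + person_effect


def wage_growth(s, z, t, person_effect, b):
    """One-year change in the log wage between t - 1 and t."""
    return log_wage(s, z, t, person_effect, b) - log_wage(s, z, t - 1, person_effect, b)


b = {"s0": 0.08, "s1": -0.002, "z0": 0.02, "z1": 0.008}  # hypothetical values

# Two workers with identical s and z but different fixed effects.
growth_a = wage_growth(12, 0.5, 5, 0.3, b)
growth_b = wage_growth(12, 0.5, 5, -1.0, b)
print(round(growth_a, 6), round(growth_b, 6))
```

The fixed effect and the level coefficients b_s0 and b_z0 drop out of the difference, so one-year wage growth depends only on b_s1 s + b_z1 z. This is why the interaction coefficients from the level equations reappear as coefficients on s and z in the growth equations.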
Columns 1-4 report OLS estimates with potential experience. The coefficient on s×Δt
will pick up the effects of secular changes in the return to education as well as the changes with
experience in the return to education. The upward secular trend in the return to schooling may
partially explain the fact that s×Δt has a positive coefficient in the basic model in column 1 while
it is negative for the corresponding level specification in Table 1, column 4.26 (A secular trend in the
return to education or AFQT matters less when estimating the equations in levels because much of
the variation in experience is across persons of different ages.) Also, the estimates are much less
precise when we estimate in first difference form. However, the key results are qualitatively similar
to the level specifications. In particular, the coefficient on s/10 declines from .0148 (.0094) in
column 1 to -.0092 (.0110) when we add the AFQT and F_ED interaction terms in column 2. The
size of the decline in this coefficient is very similar to the drop in the coefficient on s×t when we add
AFQT×t and F_ED×t to the level specifications. (See columns 5 and 6 of Table 1.) The AFQT
interaction term is positive with a t value of 3.4. The F_ED interaction is also positive and similar
in magnitude to the result obtained in levels, but it is not statistically significant.
Columns 5-8 report IV estimates of wage growth equations using actual experience as the
experience measure. The coefficient on AFQT×Δt/10 is .0905 (.0197), which compares to the value
26 See Murphy and Welch (1991) and many subsequent studies. Murnane et al. (1995) provide evidence of an increase in
the return to aptitude and achievement, as measured by tests.



of .1148 in Table 3, column 5. The coefficient on s×Δt/10 declines from .0295 (.0079) to -.0030
(.0100) when AFQT×Δt/10 and F_ED×Δt/10 are added.
5. Do Employers Statistically Discriminate on the Basis of Race?
Thus far we have focused the discussion on employers' use of education as an indicator of
labor market productivity. In this section we examine the role of race. By almost any measure,
young black men are disadvantaged relative to whites in the U.S. On average, black males have
poorer, less educated parents, are more likely to grow up in a single parent household, live in more
troubled neighborhoods, attend schools with fewer resources, and have fewer opportunities for
teenage employment than white males. Many of these factors are correlated with educational
attainment and labor market success. They are likely to lead to a black/white differential in the
average skills of young workers. Discrimination in various forms may further hinder the
development of human capital in black children, and add to a gap in skills that is due to the race
difference in socioeconomic background. The gap in some indicators of skill is very large. In our
sample, the mean percentile score on the AFQT for the black sample is 23.78 while the mean for
whites is 53.27. Neal and Johnson (1996) and a number of earlier papers have shown that in the
NLSY sample of men a substantial part of the race gap in wages is associated with the race gap in
AFQT.
If pre-market discrimination is an important factor in a gap between the average skills of
black and white workers, then it seems likely that various forms of current labor market
discrimination contribute to race differences in wages that are unrelated to skill. However, it is
nevertheless interesting to examine the possibility that a correlation between race and skill might
lead a rational, profit maximizing employer to use race as a cheap source of information about skills
and statistically discriminate on the basis of race. Such statistical discrimination along racial lines
can have very negative social consequences and is against the law. However, such discrimination
would be difficult to detect.
A statistically discriminating firm might use race, along with education and other information
to predict the productivity of new workers. With time, the productivity of the worker would
become apparent and compensation would be based on the larger information available rather than
the limited information available at the time of hire. Consequently, if statistical discrimination on the
basis of race is important, then adding interactions between t and z variables such as AFQT and





father's education to the wage equations should lead to a positive (or less negative) coefficient on
Black×t and should lead to an increase in the race intercept. As noted in section 2, if firms use race
as information then it behaves as an s variable in the model and the logic is the same as in our
analysis of the effect of education. On the other hand, if firms do not use or only partially use race
as information, then a race indicator behaves as a z variable. As discussed in Section 2, in this case
the race gap should widen with experience if race is negatively related to productivity, and adding a
second z variable that is negatively related to race will reduce the race gap in experience slopes.27
The race differential in our basic specification in column 1 of Table 1 is -.1801. This drops to
-.0969 when AFQT, F_ED, and education×t are added to the equation (column 5). When Black×t/10 is
added in column 6, it enters with a coefficient of -.1456 (.0216). This coefficient is consistent with the
hypothesis of no or very limited statistical discrimination on the basis of race and inconsistent with the
hypothesis that firms make full use of race as information. The coefficient on Black is insignificantly
different from 0, although the models do not provide a clear prediction about the sign of this variable,
since race may be correlated with information in q that can legally be used. The fact that the coefficient on
Black×t/10 rises to -.0816 when F_ED×t and AFQT×t are added to the equation (column 8) is not
informative about whether or not firms make full use of race as information.28
We obtain similar results using alternative experience measures in Tables 2 and 3. In Table 4,
columns 7 and 8, we obtain similar results after controlling for initial 2-digit occupation. We obtain
similar results using growth equations in Table 7, which should be robust to the presence of an economy
wide time trend affecting the return to education, race, and AFQT. However, in the level equations we
find that the results for race are sensitive to treatment of economy wide time trends. When we use
potential experience as the measure of t, the coefficient on Black×t declines only slightly (from -.0146
to -.0144) when we add time trend interactions involving race and AFQT to the wage level equation
corresponding to Table 1, column 7, but the race-experience interaction no longer drops when the
interaction between AFQT and experience is added. (Not reported.)
27 The learning model in section 2 implies that differences across groups in the association between s and the z variable
will lead to group differences in the $b_{st}$ and $b_{zt}$ coefficients. We have not explored this empirically. An obstacle to doing
so is that the results might be sensitive to the linearity assumptions that we have made.
28 Japanese and Chinese Americans score higher on aptitude and achievement tests than whites. Our analysis predicts
that if firms statistically discriminate on the basis of race and ethnic background then the addition of AFQT and AFQT×t
to an equation containing a dummy and experience interaction term for these groups will lead to an increase in the
dummy variable and a reduction in the experience interaction. Sample sizes do not permit an analysis of these groups.
While one could differentiate among whites based on ethnicity (see Borjas (1992)), it is not clear that these ethnic
differences are observable to employers. Our methods could be used to investigate statistical discrimination on the basis
of attending prestigious colleges or particular college majors.



We wish to stress that the simple model of statistical discrimination cannot explain the negative
coefficient on Black×t unless firms do not make full use of race as information. The accumulation of
additional information during a career that can legally be used to differentiate among workers is fully
consistent with our results. However, there are several other explanations of the race differences in the
experience slope in the literature that may be at work here. It is also important to point out that the
results for Black and Black×t alone (i.e., ignoring the behavior of the coefficients on
education and education×t) are potentially consistent with a story in which firms are fully informed,
AFQT is positively associated with on-the-job training, and the race difference in AFQT is partially
responsible for a race differential in wage growth. Adding AFQT×t would reduce a negative bias in
Black×t associated with differential training levels. The increase in Black×t when AFQT×t is added to
the model would lead to a fall in the coefficient on Black. As we report below, we obtain qualitatively
similar results when we add controls for employer training, but these controls reduce the magnitude of
the coefficient on Black×t and the effect of adding AFQT×t on the coefficient on Black×t.
Another potential test of whether race is used to statistically discriminate or not is to see
whether proposition 4 holds either when race is treated as an s variable or when it is treated as a z
variable. To do this, we use the model in column 8 of Table 1. With race treated as an s variable,
we regress the z variables (AFQT, father’s education, and the dummies for not knowing these
quantities) on the two s variables. We sum the products of these coefficients and the coefficients on
the z×t interactions in the main regression and compare them to the coefficients on the s×t
interactions. We can then conduct a joint test of whether these two quantities are equal. For the
education interaction the sum of the products equals -.00183 while the model coefficient is -.00301.
For the race interaction, the two terms have opposite signs; the sum is .00644 while the model
coefficient is -.00816. Not surprisingly, the proposition is soundly rejected.
When we treat race as a z variable, we begin our test by regressing the 5 z variables on
education, our s variable. Here, we have only one restriction to test. The sum of the products
equals -.00215 while the model coefficient equals -.00301. The proposition can be rejected at
conventional levels of significance (the P-value is .027) but with corrected standard errors this will
probably not be the case. This is a further indication that employers are not treating race as
information, or at least not fully.




6. Models with Training


In Table 8 we report estimates of equation (24) along with models that exclude the training
variables. In these models we have excluded father's education. In the basic model in column 1 the
coefficient on s×t/10 is -.0102. In column 2 we add $T_t$ and $\Sigma T_t$ to the equation. The variable $T_t$
has the expected negative sign, with a coefficient of -.1044 (.0179), while $\Sigma T_t$ has a coefficient of .1864 (.0114). The
coefficient on s×t/10 falls to -.0346. The coefficient on AFQT falls from .0828 to .0582 while the
coefficient on education rises slightly. The substantial negative experience slope on education might
be consistent with a human capital story in which knowledge obtained in school depreciates over
time unless one receives training. In column 3, AFQT×t/10 enters with a coefficient of .0502
(.0125), and the coefficient on s×t/10 drops from -.0358 to -.00427. These changes are
consistent with employer learning/statistical discrimination. If we reverse the order in which the
variables are added by adding AFQT×t before the training measures, the marginal effect of the
training measures on education×t is much smaller. We have also estimated separate models for
blacks and for whites and obtain a similar pattern.
In columns 4-6 we investigate the effect of introducing the training measure on the race gap
in wage slopes. The coefficient on Black×t/10 declines from -.1467 to -.1048 when we add the
training measures. Adding AFQT×t/10 leads to a further decline to -.0777.
To reduce the difficulties associated with the lack of data on training in the early years of the
study and individual heterogeneity that is correlated with both training and wages, we turn to a first
differenced version of (24). In the first difference version the current and lagged values of $T_t$ enter.
These results are in Table 9. The coefficient on education×t/10 declines from .0126 (.0094) to
.0073 when the training measures are added. The coefficient on Black rises from -.0995 (.0351) to
-.0923 (.0353). However, the coefficient on $T_t$ is positive while the coefficient on $T_{t-1}$ is negative.
These signs are inconsistent with a simple human capital model but are consistent with an EL-SD
model in which training opportunities are given to more productive workers and learning about
productivity occurs over time. Adding the training variables to a model that contains AFQT and
F_ED has little impact on the coefficients on these variables. (Compare columns 2 and 4.)
Imprecision in the training measures may partially explain this fact, but does not provide an
explanation for the sign pattern in the training coefficients. The coefficients on s×t and Black×t
decline in absolute value when AFQT and F_ED are added, as is predicted by the EL-SD model. Overall,
the wage change results are quite consistent with an important role for EL-SD.
We view the evidence as consistent with a role for both human capital and EL-SD, but
cannot make a precise statement about the relative contribution of these factors because, as



discussed above, training will be influenced by new information about employee performance and
the quality of the training data is suspect.

7. Information Transmission Across Firms

The formal model that we have used to interpret the results assumes that employers have the
same information about workers. The results suggest that information about productivity does
eventually get reflected in wages. However, they do not identify whether these adjustments occur
primarily in the current firm, presumably in response to outside pressure from competitors who have
information about the worker, or through moves to other employers with associated wage increases
for workers who do not move.29 In this section we briefly examine the issue of information
transmission across firms.
A number of theoretical papers discuss whether information about productivity will be
reflected in promotion paths and wage increases within firms, as well as the strategies firms might
use to try to hide information about good workers (e.g. Greenwald (1986), Waldman (1984),
Lazear (1986), Gibbons and Katz (1991)).30 Unfortunately, the theory is ambiguous about whether
a firm's private information concerning the worker will be reflected in wages offered by that firm to
incumbent workers and about the mechanism that induces the firm to adjust wages. In some private
information models in which only wages and perhaps position within the firm are observable to
outside firms, the employer’s information is not reflected in wages until the worker gets an outside
offer. In Waldman (1984) it is reflected in wages after the firm reassigns the worker to a position in
which output is more sensitive to ability. In Gibbons and Katz it is reflected in wages if the firm
chooses not to lay off the worker. The firm lays off low productivity workers, who are hired by
other firms at lower wages. Outside firms infer that the remaining workers are of higher quality,

29 Although we do not know of systematic evidence on this, casual empiricism suggests that changes in the legal system
have led some firms to adopt the explicit policy of not providing references for former employees. Also, increased firing
costs and concern about litigation may have made employers more reluctant to discharge workers for poor performance.
Statistical discrimination may become a more serious problem if information flows are restricted. This may lead firms to
relate compensation to performance more explicitly, with more turnover being a "voluntary" response to below average
wage increases. On the other hand, differences in wages across groups may be attenuated because firms may be reluctant
to open up large wage differentials between persons with similar education, seniority, and experience. It is possible that
the balance between these two considerations has changed over time.
30 None of this literature considers the implications of the possibility that employers and co-workers acquire reputations
for how positive they are in promoting the careers of individuals or that the incentives of co-workers and even supervisors
to keep favorable information about a colleague private or in concealing unfavorable information from associates outside
the firm may be quite different from those of the employer. These factors would undermine the case that firms would want
to and be able to keep inside information inside the firm.




which forces the employer to raise wages of those who stay with the firm. Both models have the
implication that hard to observe variables like AFQT, F_ED, and the wage of an older sibling should
be positively related to wage growth if one does not condition on whether a person was laid off or
not. This is what we found above.
Gibbons and Katz (1991) provide empirical support for the hypothesis that layoffs should be
negatively related to wage growth. But there are a number of other reasons why layoffs should be
negatively related to wage growth (labor market conditions, lost seniority, for example). To obtain
more focused tests, we interact personal characteristics that are hard for employers to observe
directly with indicators for layoffs and discharges. The coefficients on these variables should differ
from the coefficients on characteristics that affect productivity and are easy for employers to
observe, such as years of schooling, if (1) layoffs occur for multiple reasons, some of which have
nothing to do with the worker, (2) the probability that a layoff reflects low worker specific
productivity relative to the wage is related to z variables, and (3) outside employers have
information about the nature of the layoff or obtain information (through references, for example)
about productivity.
This suggests an equation of the form
$w_t - w_{t-1} = \beta_0 + \mathrm{Layoff}_t\,\beta_1 + z\beta_2 + z[\mathrm{Layoff}_t]\,\beta_3 + z[\mathrm{Layoff}_t]\,t\,\beta_4 + \text{other controls}.$
If knowledge acquired by firms is reflected in wages, then $\beta_2$ should be nonzero, and $\beta_3$ and $\beta_4$
should be near zero. If knowledge acquired by firms is not reflected in wages, then $\beta_2$ should be
small and $\beta_3$ and $\beta_4$ should be nonzero. Given sample size limitations we have estimated a simplified
version of the above equation on the sample of layoffs only, with $z[\mathrm{Layoff}_t]\,t$ excluded:
$w_t - w_{t-1} = (\beta_0 + \beta_1) + z\beta_3 + \text{other controls}.$
Our evidence on whether hard to observe variables are positively associated with layoff losses is
weak at best. In fact, we find that losses are larger for persons with high AFQT. We have not
controlled for labor market conditions, and among the sample of layoffs they may be correlated with
AFQT.31

31 We investigated whether the finding that wage losses rise with AFQT is driven by a positive correlation between AFQT
and employment in a white collar, non-union job, where layoffs are least likely to be influenced by seniority rules.
Gibbons and Katz note that layoffs are likely to be a particularly negative signal for white collar workers and restrict their
analysis to them. However, splitting the sample leads to an even more negative coefficient for white collar workers than for blue
collar workers.




In Table 10 we report estimates of the effect of AFQT and F_ED on employer initiated
separations. These include layoffs, firings, and plant closings. Our results were not very sensitive to
distinguishing among these three types of job loss. We find that AFQT has a weak negative effect on
the probability of losing one’s job, even after conditioning on seniority in the firm. However, when
seniority is controlled for, a swing of two standard deviations in AFQT changes this probability by
.02, which is only 1/5 of the mean layoff rate of .1. We obtain similar results when the seniority
control is dropped.
Our results suggest that only a small part of the rise with t in the effect of AFQT on wages
operates through an association between AFQT and layoffs and the wage losses experienced by
those who are laid off.
Lazear (1986) presents a model in which both the current firm and outside firms observe
indicators of the productivity of the worker. His model predicts that workers with favorable
productivity traits that are hard to observe directly will be more likely to receive outside offers and
more likely to quit than workers whose hard to observe characteristics make them less productive.
In results not reported we find that F_ED is positively related to the quit rate conditional on
education and experience and tenure. AFQT does not have a significant effect. Neither AFQT nor
F_ED is significantly related to wage growth among those who quit. (Not reported.)
These results tentatively suggest that information flows in the labor market are sufficient to
force a firm to differentiate among workers as the firm obtains better information about their
productivity. A careful investigation will require a separate paper.
8. The Potential for Testing Services to Certify Skill

Our estimates provide information about the rate at which employers learn about worker
quality. In Altonji and Pierret (1996) we use our empirical estimates to explore the implications of
the rate at which employers learn about worker quality for the empirical relevance of the educational
screening hypothesis. We show that even if employers learn relatively slowly about the productivity
of new workers, the portion of the return to education that could reflect signaling of ability is quite
limited. While education may be too expensive to serve as a means for able workers to certify
themselves to employers, perhaps other mechanisms could perform this function, at least for some
determinants of productivity. Here we point out that interpreting our estimates of the time profile of
the effect of AFQT on wages as the result of employer learning implies that high ability workers





would have a substantial financial incentive to take the AFQT to differentiate themselves from those
who are less able in this dimension.
Suppose that a third party were to administer the AFQT and certify the results to outside
employers, in much the same way that the Educational Testing Service administers the SAT exams.
Using our estimates of the learning profile and assuming that firms know all of the information
contained in AFQT by the time experience is 15, we have computed how much a person who believes
that he is 1 standard deviation above the mean for the AFQT would pay to take the test at the time he
enters the workforce.32 The OLS estimates using potential experience (underlying Table 6, panel A,
column 3) imply that if firms become fully informed about productivity by the time experience is 15 and
the interest rate is 0.1, then the person would be willing to pay .559 of the first year's salary for the test.33
The corresponding value when we use potential experience as an instrument for actual experience
(panel B, column 3) is .330.
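
The logic of this calculation can be sketched as a present-value sum. The Python fragment below is only an illustration under simplifying assumptions and is not the computation behind the .559 and .330 figures: it interpolates linearly between the AFQT derivatives reported in Table 6, panel A, treats the log-wage gap as a fraction of salary, ignores wage growth, and assumes (hypothetically) that the market is fully informed at the last tabulated experience level rather than at 15 years. The function name `willingness_to_pay` and its parameters are ours.

```python
import numpy as np

# Table 6, panel A: estimated d(log wage)/d(AFQT) at selected potential experience levels.
YEARS = np.array([0, 1, 3, 5, 8, 12])
AFQT_EFFECT = np.array([0.0197, 0.0235, 0.0370, 0.0560, 0.0881, 0.1206])


def willingness_to_pay(years, effect, full_info_effect, horizon, r=0.1, afqt_sd=1.0):
    """Discounted sum of the yearly log-wage gap between a fully informed market
    and the estimated learning profile, for a worker afqt_sd above the mean.
    The result is (approximately) a fraction of one year's salary."""
    t = np.arange(horizon + 1)
    profile = np.interp(t, years, effect)        # linear interpolation between table rows
    gap = np.clip(full_info_effect - profile, 0.0, None) * afqt_sd
    discount = (1.0 + r) ** (-t)                 # discount factor at interest rate r
    return float(np.sum(gap * discount))


# ASSUMPTION (ours): full information is reached at t = 12 with effect 0.1206,
# the last row of the table; the paper instead assumes full information at t = 15.
wtp = willingness_to_pay(YEARS, AFQT_EFFECT, full_info_effect=0.1206, horizon=12)
print(round(wtp, 3))
```

Because the horizon, the full-information effect, and the treatment of wage growth differ from those used in the text, the number printed by this sketch should not be expected to match the reported values.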
These calculations raise the issue of why such a testing service has not emerged if
information is initially imperfect. One answer is that firms are not aware that the AFQT captures
characteristics that have a strong association with productivity. It is only recently, with the
availability of the NLSY, that labor economists have become aware of this. Another is that it would
be difficult for a testing firm to become established at a national level. A third is that, given race
differences in the distribution of AFQT scores, firms who make use of AFQT information in hiring
for a specific job would have the burden of establishing that they are relevant to productivity in that
job or run the risk of violating discrimination laws. This would be true even if individuals provided
firms with the test results. However, we do not find these answers to be fully satisfactory.34
Analyses based on variables such as the wage rates of siblings or father's education may be less
vulnerable to this objection. In any event, we should also point out that our estimates of the
AFQT-experience profile are sensitive to treatment of time trends and training, so that the financial return to
being certified as high AFQT is probably substantially less than the above numbers imply.

32 If a worker did not know his ability, he could take a practice test on his own. Presumably, this would not raise the total
cost of the test very much.
33 Here we are assuming that only 1 worker takes the test and ignoring the fact that the composition of the pool of workers
who choose to take the test in equilibrium would influence the return for a particular type of worker.
34 Note also that in the absence of an institution such as the Educational Testing Service, a firm might provide the test.
Some firms perform their own testing. However, if the results were available to the employees or other firms know that a
particular firm tests its employees, then the firm would not be able to capture the full return to testing.



9. Conclusion

This paper explores the implications of the premise that firms use the information they have
available to them to form judgments about the productivity of workers and then revise these beliefs
as additional information becomes available. This is a premise that seems natural to us and receives some
strong empirical support in Farber and Gibbons (1996). If profit maximizing firms have limited
information about the general productivity of new workers, then they may use easily observable
characteristics such as years of education or race to statistically discriminate among workers. We show
that as firms acquire more information about a worker, pay may become more dependent on
productivity and less dependent on easily observable characteristics or credentials. This basic idea is
quite general and provides a way to test for statistical discrimination in the labor market and elsewhere
in situations in which agents learn, such as credit markets.
We investigate it empirically by estimating a wage equation that contains interactions
between experience and hard to observe characteristics such as AFQT and father's education along
with the interaction between experience and a variable that firms can easily observe, such as years of
education. We assume that all three variables are related to productivity. We find that the wage effects
of the hard to observe productivity variables rise with time in the labor market and the wage effect of
education falls. These results match the predictions of our model of statistical discrimination with
learning.
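
The pattern described above can be illustrated with a small simulation. The sketch below is not our estimation code and uses no NLSY data: all names, parameter values, and the linear learning weight `theta = t / 10` are hypothetical choices made for illustration. It generates wages in which firms always price the part of productivity predicted by an easily observed variable `s` and gradually price the part of a hard to observe variable `z` that `s` does not predict, then estimates the interacted wage equation by least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

s = rng.normal(size=n)              # easy to observe variable (e.g. schooling)
v = rng.normal(size=n)              # part of z that s does not predict
z = 0.5 * s + v                     # hard to observe variable (e.g. AFQT)
t = rng.integers(0, 11, size=n)     # experience, 0 to 10 years
theta = t / 10.0                    # ASSUMED share of v the market has learned by t

# Productivity is s + z. Firms always price E(s + z | s) = 1.5 * s and price v
# only to the extent theta that they have learned it.
log_wage = 1.5 * s + theta * v + rng.normal(scale=0.1, size=n)

# Wage equation with experience interactions: constant, s, z, s*t, z*t.
X = np.column_stack([np.ones(n), s, z, s * t, z * t])
coef, *_ = np.linalg.lstsq(X, log_wage, rcond=None)
b_const, b_s, b_z, b_st, b_zt = coef

print(f"s: {b_s:.3f}  z: {b_z:.3f}  s*t: {b_st:.3f}  z*t: {b_zt:.3f}")
```

In this design the wage equals (1.5 - 0.05 t) s + 0.1 t z plus noise, so the coefficient on z rises with experience while the coefficient on s falls, and the fall in the s coefficient equals minus the slope of z on s (0.5) times the rise in the z coefficient.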
We use a similar methodology to investigate whether employers statistically discriminate on
the basis of race. If our model is taken literally, the small race differentials for new workers and the
spread in the race gap with experience are most consistent with the view that race is negatively
correlated with productivity and the productivity gap becomes reflected in wages as firms acquire
additional information that can legally be used to differentiate among workers. We wish to stress,
however, that other factors are probably as important or more important in differences between whites and
blacks in wage profiles, and race differences in human capital accumulation account for at least part
of our findings. Also, our empirical results for race are sensitive to treatment of economy-wide
changes in the effects of race, AFQT, and education. Future research should also address the large
race gap and education gap in employment rates, particularly for young workers. In situations in
which there are alternatives to the conventional labor market and employees in the alternative sector
do not acquire work histories that have value or are informative to firms in the conventional sector,
then statistical discrimination of the type described above may reduce participation rates of the
disadvantaged group in the conventional labor market.
It is worth emphasizing that the analysis in the paper suggests alternative interpretations of
empirical models of wages and other outcomes that involve experience interactions. It will be useful
to re-examine the results of other studies that included interactions between experience and easy to
observe variables such as schooling, race, gender, and experience in equations that also contain
interactions between experience and harder to observe background measures. We have not been
successful in sorting out the relative importance of differences among workers in training on one
hand and statistical discrimination with learning on the other for our results. This is an important
area for future research.
An important and reasonably straightforward extension of the analysis is to other easily
observable and hard to observe background characteristics. For example, do firms statistically
discriminate on the basis of the neighborhood one is from or on the basis of the reputation of the
high school, college, or graduate school one attends? A study of whether new immigrants are
judged by the average skills of their countrymen in the U.S. would be a natural step in the research
by Borjas (1992) and others documenting differences among immigrants in labor market success.
These issues are researchable using the approach developed in this paper. Finally, it would be useful
to apply the methods of the paper to other labor market outcomes in addition to wages.




References
D. Aigner and G. Cain (1977), “Statistical Theories of Discrimination” Industrial and Labor
Relations Review.
Albrecht, J. (1981), “A Procedure for Testing the Signalling Hypothesis.” Journal of Public
Economics, pp. 123-32.
Altonji, J. G. and C. R. Pierret (1997), “Employer Learning and the Signaling Value of Education,” in
Ohashi, Isao and Tachibanaki, Toshiaki, (ed.) Internal Labour Markets. Incentives and
Employment. Macmillan Press Ltd.
Altonji, J.G. and J. Spletzer (1991), “Worker Characteristics, Job Characteristics, and the Receipt
of On-the-Job Training.” Industrial and Labor Relations Review. 45(1) pp. 58-79.
Bartel, A. P., and N. Sicherman (1993), “Technological Change and On-The-Job Training of
Young Workers” unpublished paper, Columbia University.
Becker, G. S. (1971), The Economics of Discrimination. 2nd ed., Chicago: University of Chicago
Press.
Borjas, G. (1992), “Ethnic Capital and Intergenerational Mobility” Quarterly Journal of Economics,
107(1) pp. 123-150.
Carmichael, H. L. (1989). “Self-Enforcing Contracts, Shirking, and Life Cycle Incentives” The
Journal of Economic Perspectives 3 (Fall): 65-84.
Chay, Kenneth Y. and David S. Lee, (1997) “Changes in Relative Wages in the 1980s: Returns to
Observed and Unobserved Skills and Black-White Wage Differentials”, unpublished paper.
February.
Coate, S. and G. Loury (1993) “Will Affirmative Action Policies Eliminate Negative Stereotypes?”
American Economic Review, 83 (5), pp. 1220-1240.
Devine, T.J., and Kiefer, N.M. (1991). Empirical Labor Economics (New York: Oxford University
Press).
Farber, H., and R. Gibbons, (1996) “Learning and Wage Dynamics” Quarterly Journal of
Economics, pp. 1007-47.
Frazis, H., (1993), “Selection Bias and the Degree Effect,” Journal of Human Resources 28 (3):
(Summer 1993): 538-554.
Gibbons, R., and L. Katz (1991) “Layoffs and Lemons” Journal of Labor Economics 9: 351-80.
Greenwald, B., (1986) “Adverse Selection in the Labor Market” Review of Economic Studies 53:
325-47




Foster, A. D. and M. R. Rosenzweig (1993), “Information, Learning, and Wage Rates in Low
Income Rural Areas,” Journal of Human Resources.
Holzer, H., (1988) “Search Methods Use by Unemployed Youth” Journal of Labor Economics 6:
1-20.
Jovanovic, B., (1979) “Job Matching and the Theory of Turnover” Journal of Political Economy 87:
972-90.
Katz, Lawrence F. and Kevin M. Murphy. “Changes in Relative Wages, 1963-1987: Supply and
Demand Factors.” Quarterly Journal of Economics. Vol 107:1, pp. 35-78.
Lang K., (1986) “A Language Theory of Discrimination”, Quarterly Journal of Economics. 101
(May): 363-82
Lazear, E., (1986) “Raids and Offer Matching,” Research in Labor Economics 8 (part A): 141-165
Lundberg, S., and R. Startz, “Private Discrimination and Social Intervention in Competitive Labor
Markets,” American Economic Review. 73 (June): 340-347.
L. Lynch, (1992) “Private Sector Training and the Earnings of Young Workers,” American
Economic Review 82 (March):299-312
Medoff, J. and K. Abraham. (1980), “Experience, Performance and Earnings.” Quarterly Journal
of Economics. December, pp. 703-736.
Montgomery, J. (1991). “Social Networks and Labor Market Outcomes: Toward an Economic
Analysis”. American Economic Review 81: 1408-18.
Murnane, Richard J., John B. Willett and Frank Levy, (1995) “The Growing Importance of
Cognitive Skills in Wage Determination”, Review of Economics and Statistics, vol. lxxvii, no.
2, (May): 251-266.
Murphy, K. and F. Welch, (1992) “The Structure of Wages”, Quarterly Journal of Economics. 107
(February): 285-326.
Neal, Derek A. and William R. Johnson (1996) “The Role of Premarket Factors in Black-White Wage
Differences.” Journal of Political Economy 104 (5), 869-895.
Oettinger, G. S. (1996), “Statistical Discrimination and the Early Career Evolution of the
Black-White Wage Gap”. Journal of Labor Economics.
Parsons, D. O., (1986) “The Employment Relationship: Job Attachment, Worker Effort, and the
Nature of Contracts” Ashenfelter and Layard, eds. Handbook of Labor Economics
Parsons, D. O., “Reputational Bonding of Job Performance: The Wage Consequences of Being
Fired” unpublished paper. (July 1993)



Riley, J. G. (1979) “Testing the Educational Screening Hypothesis,” Journal of Political Economy.
October, pp. S227-52.
Taber, C. R., (1996) “The Rising College Premium in the Eighties: Return to College or Return to
Ability,” unpublished paper, Northwestern University (April).
Spletzer, J. and M. Lowenstein (1996), “Belated Training: The Relationship Between Training,
Tenure and Wages” unpublished paper, Bureau of Labor Statistics.
Waldman, M. (1984), “Job Assignment, Signaling, and Efficiency,” Rand Journal of Economics. 15:
255-67.
Weiss, A. (1995), “Human Capital and Sorting Models,” Journal of Economic Perspectives.
White, H. (1984). Asymptotic Theory for Econometricians, Orlando, FL: Academic Press




Table 1: The Effects of Standardized AFQT, Father's Education, and Schooling on Wages
Dependent Variable: Log Wage. Experience Measure: Potential Experience.

OLS estimates (standard errors)
Model:

(1)

(2)

(3)

(4)

(5)

(6)

(7)

(8)

(9)

(10)

(a) Education

0.0946
(0.0034)

0.0742
(0.0039)

0.0729
(0.0040)

0.1001
(0.0051)

0.0798
(0.0054)

0.0984
(0.0057)

0.0818
(0.0054)

0.0949
(0.0057)

0.0788
(0.0064)

0.0855
(0.0064)

(b) Black

-0.1801
(0.0117)

-0.1039
(0.0138)

-0.0974
(0.0141)

-0.1799
(0.0117)

-0.0969
(0.0141)

-0.0956
(0.0142)

0.0153
(0.0203)

-0.0330
(0.0226)

-0.0948
(0.0142)

-0.0945
(0.0141)

0.0807
(0.0077)

0.0783
(0.0078)

0.0785
(0.0078)

0.0179
(0.0107)

0.0790
(0.0077)

0.0328
(0.0115)

0.0187
(0.0107)

-0.0028
(0.0111)

0.0263
(0.0192)

0.0259
(0.0192)

-0.0015
(0.0031)

0.0028
(0.0019)

-0.0062
(0.0308)

-0.0163
(0.0308)

-0.0286
(0.0318)

-0.0098
(0.0061)

-0.0351
(0.0069)

-0.0122
(0.0061)

-0.0301
(0.0071)

-0.0510
(0.0087)

-0.0361
(0.0105)

(c) Standardized
AFQT
(d) Father's
Education/10
(e) Education *
Experience/10

-0.0077
(0.0062)

AFQT *
Experience/10

0.0820
(0.0125)

0.0622
(0.0143)

0.0817
(0.0125)

0.0316
(0.0241)

(g) Father'sEd *
Experience/100

0.0592
(0.0372)

0.0481
(0.0372)

0.0611
(0.0371)

0.0392
(0.0667)

(f)

(h) Black *
Experience/10

-0.1456
(0.0216)

-0.0816
(0.0262)

Note: All equations control for a quadratic time trend, urban residence, and dummy variables to control for whether Father's education is missing and whether AFQT is
missing, and interactions between these dummy variables and experience when Experience interactions are included. Column 9 includes the interaction between
education and time/10 (the estimate is .0349 (.0078)). Column 10 includes interactions of education (.0142 (.0101)), AFQT (.0688 (.0228)), and Father's Education/10
(.0317 (.0631)) with time/10. Standard errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker.
The sample size is 27704 observations from 4042 individuals.




Table 2: The Effects of Standardized AFQT, Father's Education, and Schooling on Wages
Dependent Variable: Log Wage. Experience Measure: Actual Experience.

OLS estimates (standard errors)
Model:

(1)

(2)

(3)

(4)

(5)

(6)

(7)

(8)

(9)

(10)

(a) Education

0.0805
(0.0026)

0.0613
(0.0031)

0.0599
(0.0032)

0.0713
(0.0033)

0.0504
(0.0038)

0.0628
(0.0040)

0.0524
(0.0038)

0.0614
(0.0040)

0.0385
(0.0059)

0.0485
(0.0064)

(b) Black

-0.1378
(0.0113)

-0.0673
(0.0132)

-0.0624
(0.0134)

-0.1381
(0.0113)

-0.0625
(0.0134)

-0.0622
(0.0135)

-0.0025
(0.0152)

-0.0346
(0.0159)

-0.0608
(0.0135)

-0.0602
(0.0135)

0.0754
(0.0073)

0.0730
(0.0075)

0.0731
(0.0075)

0.0366
(0.0082)

0.0726
(0.0075)

0.0430
(0.0084)

0.0373
(0.0082)

0.0041
(0.0120)

0.0324
(0.0186)

0.0321
(0.0187)

0.0005
(0.0022)

0.0032
(0.0019)

0.0085
(0.0224)

0.0042
(0.0223)

-0.0127
(0.0345)

0.0200
(0.0054)

-0.0055
(0.0066)

0.0163
(0.0054)

-0.0025
(0.0068)

-0.0320
(0.0099)

-0.0165
(0.0114)

(c)

Standardized
AFQT

(d) Father's
Education/10

(e)

Education *
Experience/10

(f)

AFQT*
Experience/10

0.0750
(0.0131)

0.0614
(0.0148)

0.0737
(0.0131)

0.0226
(0.0240)

(g) Father's Ed *
Experience/100

0.0587
(0.0367)

0.0502
(0.0370)

0.0613
(0.0365)

0.0362
(0.0678)

(h) Black *
Experience/10

0.0195
(0.0055)

-0.1267
(0.0233)

-0.0583
(0.0280)

Note: All equations control for a quadratic time trend, urban residence, and dummy variables to control for whether Father's education is missing and whether AFQT is
missing, and interactions between these dummy variables and experience when Experience interactions are included. Column 9 includes the interaction between
education and time/10 (the estimate is .0402 (.0085)). Column 10 includes interactions of education (.0195 (.0104)), AFQT (.0684 (.0211)), and Father's Education/10
(.0333 (.0623)) with time/10. Standard errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker.
The sample size is 27704 observations from 4042 individuals.




Table 3: IV Estimates of the Effects of Standardized AFQT, Father's Education, and Schooling on Wages
Dependent Variable: Log Wage. Experience Measure: Actual Experience with Potential Experience as Instruments.

IV estimates (standard errors)
Model:

(1)

(2)

(3)

(4)

(5)

(6)

(7)

(8)

(9)

(10)

(a) Education

0.0813
(0.0028)

0.0620
(0.0034)

0.0606
(0.0035)

0.0891
(0.0050)

0.0692
(0.0054)

0.0879
(0.0056)

0.0726
(0.0054)

0.0843
(0.0057)

0.0468
(0.0065)

0.0517
(0.0069)

(b) Black

-0.1368
(0.0116)

-0.0650
(0.0132)

-0.0601
(0.0135)

-0.1367
(0.0116)

-0.0600
(0.0136)

-0.0593
(0.0136)

0.0495
(0.0186)

0.0054
(0.0205)

-0.0531
(0.0138)

-0.0527
(0.0138)

0.0762
(0.0074)

0.0737
(0.0075)

0.0738
(0.0076)

0.0177
(0.0096)

0.0728
(0.0075)

0.0332
(0.0102)

0.0218
(0.0096)

0.0005
(0.0127)

0.0337
(0.0188)

0.0340
(0.0188)

0.0000
(0.0028)

0.0033
(0.0019)

0.0091
(0.0282)

-0.0043
(0.0278)

0.0111
(0.0363)

-0.0181
(0.0087)

-0.0561
(0.0100)

-0.0242
(0.0087)

-0.0483
(0.0101)

-0.1220
(0.0188)

-0.1090
(0.0221)

(c) Standardized
AFQT
(d) Father's
Education/10
(e) Education *
Experience/10

-0.0165
(0.0088)

AFQT*
Experience/10

0.1148
(0.0164)

0.0819
(0.0188)

0.1056
(0.0163)

0.0539
(0.0399)

(g) Father'sEd *
Experience/100

0.0744
(0.0480)

0.0531
(0.0484)

0.0877
(0.0478)

0.1219
(0.1124)

(f)

(h) Black*
Experience/10

-0.2305
(0.0318)

-0.1364
(0.0387)

Note: All equations control for a quadratic time trend, urban residence, and dummy variables to control for whether Father's education is missing and whether AFQT is
missing, and interactions between these dummy variables and experience when Experience interactions are included. The instrumental variables are the corresponding
terms involving potential experience and the other variables in the model. Column 9 includes the interaction between education and time/10 (the estimate is 0.0803
(.0135)). Column 10 includes interactions of education (.0670 (.0166)), AFQT (.0546 (.0311)), and Father's Education/10 (-.0376 (.0882)) with time/10. Standard errors
are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker. The sample size is 27704 observations from
4042 individuals.





Table 4: Estimates of the Effects of Standardized AFQT, Father's Education, and Schooling on Wages
Controlling for 2-digit Occupation Codes of Initial Job
Dependent Variable: Log Wage. Experience Measure: Potential Experience.
OLS estimates (standard errors)
Model:

(1)

(2)

(3)

(4)

(5)

(6)

(7)

(8)

(9)

(10)

(a) Education

0.0759
(0.0042)

0.0611
(0.0045)

0.0596
(0.0046)

0.0717
(0.0058)

0.0572
(0.0061)

0.0767
(0.0064)

0.0592
(0.0061)

0.0745
(0.0064)

0.0666
(0.0067)

0.0709
(0.0068)

(b) Black

-0.1539
(0.0131)

-0.0917
(0.0154)

-0.0829
(0.0156)

-0.1539
(0.0131)

-0.0830
(0.0156)

-0.0812
(0.0156)

0.0190
(0.0209)

-0.0413
(0.0222)

-0.0809
(0.0157)

-0.0809
(0.0156)

0.0662
(0.0085)

0.0635
(0.0086)

0.0634
(0.0086)

-0.0036
(0.0111)

0.0638
(0.0086)

0.0061
(0.0117)

-0.0030
(0.0111)

-0.0151
(0.0119)

0.0298
(0.0206)

0.0299
(0.0207)

-0.0049
(0.0315)

0.0310
(0.0206)

0.0010
(0.0316)

-0.0051
(0.0316)

-0.0188
(0.0342)

0.0032
(0.0065)

-0.0245
(0.0075)

0.0008
(0.0065)

-0.0212
(0.0078)

-0.0354
(0.0096)

-0.0254
(0.0119)

AFQT*
Experience/10

0.0940
(0.0140)


0.0807
(0.0159)

0.0940
(0.0140)

0.0626
(0.0280)

(g) Father'sEd *
Experience/100

0.0532
(0.0411)

0.0447
(0.0414)

0.0530
(0.0411)

0.0229
(0.0752)

(c) Standardized
AFQT
(d) Father's
Education/10
(e) Education *
Experience/10
(f)

(h) Black *
Experience/10

0.0057
(0.0066)

-0.1377
(0.0233)

-0.0542
(0.0281)

Note: All equations control for a quadratic time trend, urban residence, and dummy variables to control for whether Father's education is missing and whether AFQT is
missing, and interactions between these dummy variables and experience when Experience interactions are included. Column 9 includes the interaction between
education and time/10 (the estimate is .0206 (.0083)). Column 10 includes interactions of education (.0072 (.0111)), AFQT (.0409 (.0263)), and Father's Education/10
(.0405 (.0697)) with time/10. Standard errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker.
The sample size is 22271 observations from 3187 individuals.




Table 5: OLS Estimates of the Effects of Sibling Wage and Schooling on Wages
Dependent Variable: Log Wage. Experience Measure: Potential Experience

OLS estimates (standard errors)
Model:

(1)

(a)

Education

0.0936
(0.0055)

0.0830
(0.0055)

0.1032
(0.0077)

0.0900
(0.0077)

0.0938
(0.0078)

0.0803
(0.0089)

0.0805
(0.0089)

(b)

Black

-0.1932
(0.0164)

-0.1620
(0.0163)

-0.1932
(0.0164)

-0.1621
(0.0163)

-0.1620
(0.0163)

-0.1619
(0.0163)

-0.1619
(0.0163)

(c)

Log Wage of Oldest
Non-Missing Sibling

0.1876
(0.0191)

0.1873
(0.0191)

0.1266
(0.0276)

0.1264
(0.0276)

0.1230
(0.0323)

(d)

Sibling is Female

0.0205
(0.0155)

0.0208
(0.0155)

0.0211
(0.0155)

0.0214
(0.0155)

0.0213
(0.0155)

(e)

Education *
Experience/10

-0.0097
(0.0089)

-0.0146
(0.0090)

-0.0220
(0.0113)

-0.0216
(0.0120)

(f)

Log of Sibling Wage *
Experience/10

0.0860
(0.0327)

0.0862
(0.0326)

0.0802
(0.0679)

(2)

(3)

-0.0133
(0.0091)

(4)

(5)

(6)

(7)

Note: All equations control for a quadratic time trend and urban residence. Column 6 includes the interaction between education and time/10 (the estimate is
.0204 (.0113)). Column 7 includes interactions of education (.0199 (.0122)) and Log Sibling Wage (.0085 (.0667)) with time/10. Standard errors are
White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker. The sample size is 13,555 observations from
1881 individuals.




Table 6: The Effects of Standardized AFQT and Schooling on Wages Over Time.
Derivatives at Selected Experience Levels
Dependent Variable: Log Wage

A) Potential Experience.

Years of      ∂w_t/∂AFQT    ∂²w_t/∂AFQT∂t    ∂w_t/∂s      ∂²w_t/∂s∂t
Experience

 0             0.0197        0.0025           0.0786       0.0053
              (0.0235)      (0.0139)         (0.0092)     (0.0040)
 1             0.0235        0.0049           0.0830       0.0034
              (0.0275)      (0.0144)         (0.0101)     (0.0040)
 3             0.0370        0.0084           0.0865       0.0002
              (0.0347)      (0.0155)         (0.0116)     (0.0042)
 5             0.0560        0.0104           0.0843      -0.0023
              (0.0415)      (0.0166)         (0.0131)     (0.0043)
 8             0.0881        0.0104           0.0731      -0.0048
              (0.0512)      (0.0181)         (0.0152)     (0.0046)
12             0.1206        0.0048           0.0513      -0.0056
              (0.0640)      (0.0201)         (0.0179)     (0.0048)

B) Actual Experience Instrumented with Potential Experience.

Years of      ∂w_t/∂AFQT    ∂²w_t/∂AFQT∂t    ∂w_t/∂s      ∂²w_t/∂s∂t
Experience

 0             0.0183        0.0092           0.0881      -0.0024
              (0.0205)      (0.0231)         (0.0105)     (0.0086)
 1             0.0278        0.0099           0.0843      -0.0051
              (0.0316)      (0.0251)         (0.0137)     (0.0089)
 3             0.0496        0.0120           0.0711      -0.0074
              (0.0496)      (0.0288)         (0.0190)     (0.0096)
 5             0.0755        0.0138           0.0566      -0.0068
              (0.0658)      (0.0323)         (0.0236)     (0.0102)
 8             0.1172        0.0131           0.0406      -0.0037
              (0.0893)      (0.0373)         (0.0300)     (0.0112)
12             0.1475       -0.0012           0.0299      -0.0032
              (0.1206)      (0.0436)         (0.0381)     (0.0123)

Note: The equations contain the same variables as the equation in column (6) of Table 1 except the interactions between
education and experience and between AFQT and experience involve fourth-order polynomials in experience. In
panel B, the instrumental variables are the corresponding terms involving potential experience and the other
variables in the model.





Table 7: Estimates of the Effects of AFQT, Father's Education, and Schooling on Wage Growth
Dependent Variable: Δ log Wage.
Coefficient Estimates (standard errors)

                                   OLS, potential experience              IV, actual experience treated as endogenous
Variable                          (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)

Education * ΔExperience         0.0148   -0.0092   -0.0080    0.0126    0.0295   -0.0030    0.0021    0.0256
                               (0.0094)  (0.0110)  (0.0113)  (0.0094)  (0.0079)  (0.0100)  (0.0101)  (0.0080)

AFQT * ΔExperience                        0.0646    0.0595                        0.0905    0.0700
                                         (0.0192)  (0.0210)                      (0.0197)  (0.0213)

Father's Education *                      0.0809    0.0776                        0.0952    0.0818
ΔExperience                              (0.0557)  (0.0563)                      (0.0561)  (0.0567)

Black * ΔExperience                                -0.0213   -0.0995                       -0.0850   -0.1768
                                                   (0.0409)  (0.0351)                      (0.0420)  (0.0372)

S.E.E.                          .29655    .29650    .29650    .29653    .29600    .29589    .29588    .29590

Note: All equations control for the change in a quadratic time trend, change in urban residence, and dummy variables to control for whether father's education
is missing and whether AFQT is missing, and interactions between these dummy variables and the change in experience when change in experience interactions
are included. The instrumental variables are the corresponding terms involving potential experience and the other variables in the model. Standard errors are
White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker. The sample size is 19393 observations from
3580 individuals.




Table 8: The Effects of Standardized AFQT, Schooling, and Training on Wages
Dependent Variable: Log Wage; Experience Measure: Potential Experience
Training Measure: Predicted before 88, Actual After
OLS estimates (standard errors)

Model:                            (1)       (2)       (3)       (4)       (5)       (6)

(a) Education                   0.0808    0.0856    0.0951    0.0830    0.0869    0.0921
                               (0.0054)  (0.0055)  (0.0057)  (0.0054)  (0.0055)  (0.0058)

(b) Black                      -0.1008   -0.0920   -0.0916    0.0117   -0.0131   -0.0332
                               (0.0142)  (0.0143)  (0.0143)  (0.0206)  (0.0203)  (0.0221)

(c) Standardized AFQT           0.0822    0.0572    0.0218    0.0828    0.0582    0.0376
                               (0.0078)  (0.0079)  (0.0104)  (0.0078)  (0.0078)  (0.0114)

(e) Education *                -0.0102   -0.0346   -0.0472   -0.0129   -0.0358   -0.0427
    Experience/10              (0.0062)  (0.0066)  (0.0073)  (0.0062)  (0.0066)  (0.0075)

(f) AFQT *                                          0.0502                        0.0288
    Experience/10                                  (0.0125)                      (0.0149)

(g) Black *                                                  -0.1467   -0.1048   -0.0777
    Experience/10                                            (0.0221)  (0.0222)  (0.0266)

(h) Training: T_t                        -0.1044   -0.0936             -0.0974   -0.0930
                                         (0.0179)  (0.0180)            (0.0179)  (0.0180)

(i) Cumulative                            0.1864    0.1781              0.1810    0.1776
    Training: ΣT_τ                       (0.0114)  (0.0116)            (0.0114)  (0.0116)

Note: All equations control for a quadratic time trend, urban residence, and a cubic in potential experience. In this table, T_t and ΣT_τ are the predicted probability
of training in year t if before 1987 and actual training if year t is after 1987. Predictions are based on a probit model containing: years of schooling,
potential experience, Black, AFQTPCT, schooling times potential experience and potential experience squared, AFQT times potential experience and
potential experience squared, and the product of AFQTPCT, schooling, and potential experience. Standard errors are White/Huber standard errors computed
accounting for the fact that there are multiple observations for each worker. The sample size is 25115 observations from 3768 individuals.





Table 9: Estimates of the Effects of AFQT, Father's Education, and Schooling on Wage Growth
with Controls for Training
Dependent Variable: Δ log Wage. Experience Measure: Potential Experience
Coefficient Estimates (standard errors)

Variable                            (1)       (2)       (3)       (4)

Education * ΔExperience/10        0.0126   -0.0080    0.0073   -0.0108
                                 (0.0094)  (0.0113)  (0.0096)  (0.0113)

AFQT * ΔExperience/10                       0.0595              0.0533
                                           (0.0210)            (0.0211)

Father's Education *                        0.0078              0.0075
ΔExperience/10                             (0.0056)            (0.0056)

Black * ΔExperience/10           -0.0995   -0.0213   -0.0923   -0.0215
                                 (0.0351)  (0.0409)  (0.0353)  (0.0408)

Lagged Training: lagged T / 10                       -0.0109   -0.0336
                                                     (0.0950)  (0.0951)

Training: T / 10                                      0.2622    0.2446
                                                     (0.0891)  (0.0894)

S.E.E.                            .29653    .29650    .29649    .29647

Note: All equations control for the change in a quadratic time trend, change in urban residence, and dummy variables to control for whether father's education
is missing and whether AFQT is missing, and interactions between these dummy variables and the change in experience when change in experience interactions
are included. Standard errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker. The
sample size is 19393 observations from 3580 individuals.





Table 10: The Effects of Potential Experience, Standardized AFQT, Father's Education, and Schooling on the
Probability of Employer-Initiated Separation
Linear Probability Models
Dependent Variable: Employer-Initiated Separation.
OLS estimates (standard errors)

Model:                                       (1)        (2)

(a) Potential Experience / 10              -0.0302    -0.1646
                                           (0.0194)   (0.0540)

(b) Potential Experience Squared / 100     -0.0143     0.0148
                                           (0.0113)   (0.0129)

(c) Tenure                                 -0.0141    -0.0146
                                           (0.0006)   (0.0006)

(d) Education                              -0.0153    -0.0206
                                           (0.0012)   (0.0027)

(e) Black                                   0.0272     0.0265
                                           (0.0057)   (0.0057)

(f) Standardized AFQT                      -0.0108    -0.0251
                                           (0.0029)   (0.0061)

(g) Father's Education / 100                0.0303     0.0991
                                           (0.0701)   (0.1532)

(h) Education * Experience / 10                        0.0083
                                                      (0.0033)

(i) AFQT * Experience / 10                             0.0188
                                                      (0.0065)

(j) Father's Ed * Experience / 1000                   -0.0910
                                                      (0.1738)

Note: An Employer-Initiated Separation includes separations because of layoffs, firings, and plant closings. All equations
control for urban residence, and dummy variables to control for whether Father's education is missing and whether AFQT is
missing, and interactions between these dummy variables and experience when Experience interactions are included. Standard
errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker.
The sample size is 27443 observations from 4034 individuals.





Table A1: Descriptive Statistics

Variable                                       Mean     Standard    Minimum    Maximum
                                                        Deviation

Real Hourly Wage                               8.370      4.766       2.01      96.46
Log of Real Hourly Wage (w)                    2.005      0.474       0.7        4.57
Potential Experience (t)                       7.349      3.665       0         21
Actual Experience (t)                          4.925      3.424       0         18.26
Education (s)                                 12.699      2.136       8         18
Black dummy (Black)                            0.290      0.454       0          1
Dummy for not knowing AFQT Score               0.038      0.191       0          1
Standardized AFQT Score (AFQT)                -0.133      1.022      -2.780      1.922
Dummy for not knowing Father's Education       0.119      0.324       0          1
Father's Education (F_ED)                     11.709      3.112       4         20
Dummy for Urban Dweller                        0.781      0.413       0          1
Year                                          86.623     81.558      79         92
Training (T_t)                                 0.096      0.200       0          1
Cumulative Training (ΣT_τ)                     0.462      0.549       0          5.592

Sample size = 27,704 observations except for the training measures where it is 25,115 observations.





Table A2: Relationships Among Wages, Schooling, AFQT, and Parental Education Simple Regression
Coefficients (standard error) and [Correlation coefficient]
Dependent Variable
Right Hand
Side Variable

Log Wage

Highest
Grade

Father’s Standard.
Education AFQT
0.6197
(0.0098)
[0.4029]

Weeks of
Company
Training

Layoff

Quit

0.2747
(0.0027)
[0.5829]

0.1189
(0.0163)
[0.0514]

-0.0193
(0.0010)
[-0.1259]

-0.0128
(0.0014)
[-0.0589]

-0.0823
(0.0103)
[-0.0329]

-0.4831
(0.0106)
[-0.2923]

0.1341
(0.0019)
[0.4362]

0.0621
(0.0106)
[0.0392]

-0.0059
(0.0007)
[-0.0542]

0.0014
(0.0009)
[0.0112]

-0.0323
(0.0067)
[-0.0331]

-0.1660
(0.0071)
[-0.1538]

0.3072
(0.0345)
[0.0645]

-0.0377
(0.0021)
[-0.1174]

-0.0121
(0.0029)
[-0.0306]

-0.1036
(0.0218)
[-0.0142]

-0.8138
(0.0227)
[-0.2329]

-0.0011
(0.0004)
[-0.0130]

-0.0017
(0.0006)
[-0.0190]

-0.0173
(0.0044)
[-0.0282]

-0.0297
(0.0047)
[-0.0453]

-0.2707
(0.0092)
[-0.2080]

-1.1766
(0.0696)
[-0.1391]

-0.3223
(0.0753)
[-0.0629]

-1.2747
(0.0510)
[-0.1658]

-0.8834
(0.0553)
[-0.1070]

Highest
Grade

0.0785
(0.0014)
[0.3615]

Father’s
Education

0.0298
(0.0010)
[0.2092]

0.2592
(0.0041)
[0.4029]

Standardized
AFQT

0.1565
(0.0031)
[0.3567]

1.2245
(0.0119)
[0.5829]

1.4280
(0.0204)
[0.4362]

Weeks of
Company
Training

0.0045
(0.0007)
[0.0429]

0.0214
(0.0029)
[0.0514]

0.0268
(0.0046)
[0.0392]

0.0124
(0.0014)
[0.0645]

Layoff

-0.1659
(0.0104)
[-0.1094]

-0.8921
(0.0468)
[-0.1259]

-0.6558
(0.0728)
[-0.0542]

-0.3904
(0.0222)
[-0.1174]

-0.2702
(0.1112)
[-0.0130]

Quit

-0.2145
(0.0076)
[-0.1909]

-0.3232
(0.0348)
[-0.0589]

0.0814
(0.0539)
[0.0112]

-0.0683
(0.0165)
[-0.0306]

-0.2321
(0.0821)
[-0.0190]

-0.1478
(0.0050)
[-0.2080]

Actual
Experience

0.0444
(0.0010)
[0.2893]

-0.0374
(0.0047)
[-0.0329]

-0.0350
(0.0072)
[-0.0331]

-0.0106
(0.0022)
[-0.0142]

-0.0436
(0.0110)
[-0.0282]

-0.0116
(0.0007)
[-0.1391]

-0.0230
(0.0009)
[-0.1658]

Potential
Experience

0.0174
(0.0010)
[0.1044]

-0.1899
(0.0042)
[-0.2923]

-0.1561
-0.0718
(0.0066) (0.0020)
[-0.1538] [-0. 2329]

-0.0647
(0.0103)
[-0.0453]

-0.0027
(0.0006)
[-0.0629]

-0.0138
(0.0009)
[-0.1070]




Potential
Actual
Experience Experience

0.8605
(0.0045)
[0.7953]
0.7452
(0.0039)
[0.7953]
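
As a reading aid for Table A2, the following minimal sketch illustrates how the two simple regression slopes for a pair of variables relate to their correlation coefficient: when both regressions are run on one common unweighted sample, the product of the slopes equals the squared correlation. The data and the names `simple_slope` and `correlation` are hypothetical and are not taken from the NLSY sample.

```python
from math import sqrt


def simple_slope(x, y):
    """OLS slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def correlation(x, y):
    """Pearson correlation coefficient of x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical toy data, for illustration only.
schooling = [10, 12, 12, 14, 16, 16, 18]
log_wage = [1.9, 2.1, 2.0, 2.3, 2.4, 2.6, 2.7]

b_wage_on_school = simple_slope(schooling, log_wage)   # row: schooling, column: log wage
b_school_on_wage = simple_slope(log_wage, schooling)   # the mirror-image cell
r = correlation(schooling, log_wage)

# The two slopes multiply to the squared correlation.
print(b_wage_on_school * b_school_on_wage, r ** 2)
```

This identity is why each off-diagonal pair of cells in the table reports the same bracketed correlation.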


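Appendix 1

From equation (11) we have

$$\Phi_s = -\operatorname{cov}(s,z)\,\frac{\Lambda \operatorname{var}(v) + \operatorname{cov}(v,e)}{|\operatorname{var}(s,z)|}
\quad\text{and}\quad
\Phi_z = \operatorname{var}(s)\,\frac{\Lambda \operatorname{var}(v) + \operatorname{cov}(v,e)}{|\operatorname{var}(s,z)|}.$$

We know that

$$\Phi_{zs} = \frac{\operatorname{cov}(s,z)}{\operatorname{var}(s)}.$$

This gives us the desired result:

$$\Phi_s = -\Phi_{zs}\,\Phi_z$$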
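
The result of Appendix 1 can be checked numerically. The sketch below is illustrative only: the moment values are hypothetical, `cov_z_y` stands in for $\Lambda \operatorname{var}(v) + \operatorname{cov}(v,e)$, and the covariance between s and the dependent variable is set to zero, as the derivation assumes. It solves the two-variable normal equations by Cramer's rule and compares the two sides of the identity.

```python
from fractions import Fraction as F


def regress_on_s_and_z(var_s, var_z, cov_sz, cov_s_y, cov_z_y):
    """Population coefficients from regressing y on (s, z), via Cramer's rule."""
    det = var_s * var_z - cov_sz ** 2          # |var(s,z)|
    phi_s = (cov_s_y * var_z - cov_sz * cov_z_y) / det
    phi_z = (var_s * cov_z_y - cov_sz * cov_s_y) / det
    return phi_s, phi_z


# Hypothetical population moments (assumptions for illustration).
var_s, var_z, cov_sz = F(4), F(9), F(3)
cov_s_y = F(0)   # s is uncorrelated with Lambda*v + e
cov_z_y = F(2)   # stands in for Lambda*var(v) + cov(v, e)

phi_s, phi_z = regress_on_s_and_z(var_s, var_z, cov_sz, cov_s_y, cov_z_y)
phi_zs = cov_sz / var_s   # coefficient from regressing z on s

print(phi_s, -phi_zs * phi_z)
```

Exact rational arithmetic is used so that the two sides can be compared without rounding error.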
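Appendix 2: Derivation of Equations (16) and (17)

Consider equation (15). Rewriting $\operatorname{var}(s,z)^{-1}$ as a partitioned matrix leads to

$$\operatorname{var}(s,z)^{-1} =
\begin{bmatrix}
\operatorname{var}(s) & \operatorname{cov}(s,z) \\
\operatorname{cov}(z,s) & \operatorname{var}(z)
\end{bmatrix}^{-1}$$

where $\operatorname{var}(s,z)$ is the $(K+1)\times(K+1)$ variance matrix.
Using the partitioned inverse formula and ignoring the first column (since it will be multiplied by 0), we have:

$$\text{(15a)}\quad
E\begin{bmatrix} b_{st} \\ b_{zt} \end{bmatrix}
= \begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix}
+ \begin{bmatrix} -\operatorname{cov}(s,z)\,G \\ \operatorname{var}(s)\,G \end{bmatrix}
\left[\operatorname{cov}\bigl(z, E(\Lambda v + e \mid D_t)\bigr)\right]$$

where $G = \left[\operatorname{var}(s)\operatorname{var}(z) - \operatorname{cov}(z,s)\operatorname{cov}(s,z)\right]^{-1}$.

Now, consider the diagonal matrix $K$ which has the elements of $\operatorname{cov}(z, \Lambda v + e)$ along the diagonal. $K^{-1}$ is also diagonal. Thus (15a) may be rewritten as:

$$\text{(15b)}\quad
E\begin{bmatrix} b_{st} \\ b_{zt} \end{bmatrix}
= \begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix}
+ \begin{bmatrix} -\operatorname{cov}(s,z)\,G \\ \operatorname{var}(s)\,G \end{bmatrix}
K\,K^{-1}\left[\operatorname{cov}\bigl(z, E(\Lambda v + e \mid D_t)\bigr)\right]$$

Manipulating this further gives us:

$$\text{(15c)}\quad
E\begin{bmatrix} b_{st} \\ b_{zt} \end{bmatrix}
= \begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix}
+ \begin{bmatrix} -\operatorname{cov}(s,z)\,G \\ \operatorname{var}(s)\,G \end{bmatrix}
\theta_t \left[\operatorname{cov}(z, \Lambda v + e)\right]$$

where $\theta_t$ is a diagonal matrix with element $kk$ equal to

$$\theta_{kt} = \frac{\operatorname{cov}\bigl(z_k, E(\Lambda v + e \mid D_t)\bigr)}{\operatorname{cov}(z_k, \Lambda v + e)}.$$

Using the definitions above and evaluating this equation leads to (16) and (17) in the text.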
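
The partitioned inverse step in Appendix 2 can be illustrated with a small numerical check. The sketch below assumes K = 2 and uses hypothetical moment values. It verifies that the last K columns of the inverse of the full variance matrix equal the stacked blocks $-\operatorname{cov}(s,z)G$ and $\operatorname{var}(s)G$. The helper `inverse` is a plain Gauss-Jordan routine written for this example, not part of the paper.

```python
from fractions import Fraction as F


def inverse(m):
    """Gauss-Jordan inverse of a small nonsingular matrix of Fractions."""
    n = len(m)
    aug = [list(row) + [F(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical population moments with K = 2 (assumptions for illustration).
var_s = F(4)
cov_sz = [F(1), F(2)]                    # cov(s, z), a 1 x K row
var_z = [[F(5), F(1)], [F(1), F(6)]]     # var(z), K x K
K = len(cov_sz)

# Full (K+1) x (K+1) variance matrix var(s, z) and its inverse.
V = [[var_s] + cov_sz] + [[cov_sz[i]] + var_z[i] for i in range(K)]
V_inv = inverse(V)

# G = [var(s) var(z) - cov(z,s) cov(s,z)]^{-1}
G = inverse([[var_s * var_z[i][j] - cov_sz[i] * cov_sz[j] for j in range(K)]
             for i in range(K)])

top_block = [-sum(cov_sz[i] * G[i][j] for i in range(K)) for j in range(K)]
bottom_block = [[var_s * G[i][j] for j in range(K)] for i in range(K)]

last_k_columns = [row[1:] for row in V_inv]
print(last_k_columns[0] == top_block, last_k_columns[1:] == bottom_block)
```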

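Appendix 3

The regression parameters $b^{1}_{z_1 t}$ and $b^{2}_{z_1 t}$ are

$$b^{1}_{z_1 t} = b^{1}_{z_1 0} + \theta_t \Phi^{1}_{z_1}
\quad\text{and}\quad
b^{2}_{z_1 t} = b^{2}_{z_1 0} + \theta_t \Phi^{2}_{z_1},$$

where $\Phi^{1}_{z_1}$ is the coefficient on $z_1$ in the regression of $\Lambda v + e$ on $s$ and $z_1$, and $\Phi^{2}_{z_1}$ is the coefficient on $z_1$ in the regression of $\Lambda v + e$ on $s$, $z_1$, and $z_2$. By the omitted variables formula, we know that $\Phi^{1}_{z_1} = \Phi^{2}_{z_1} + \Phi_{z_2}\Phi_{z_2 z_1}$, where $\Phi_{z_2}$ is the coefficient on $z_2$ in the regression of $\Lambda v + e$ on $s$, $z_1$, and $z_2$, and $\Phi_{z_2 z_1}$ is the coefficient on $z_1$ in the regression of $z_2$ on $z_1$ and $s$. Therefore,

$$\frac{\partial b^{1}_{z_1 t}}{\partial \theta_t} = \Phi^{1}_{z_1}
\quad\text{and}\quad
\frac{\partial b^{2}_{z_1 t}}{\partial \theta_t} = \Phi^{2}_{z_1}.$$

Taking the difference establishes that

$$\frac{\partial b^{1}_{z_1 t}}{\partial \theta_t} - \frac{\partial b^{2}_{z_1 t}}{\partial \theta_t} = \Phi_{z_2}\Phi_{z_2 z_1},$$

where $\Phi_{z_2}$ is the coefficient on $z_2$ in the regression of $\Lambda v + e$ on $s$, $z_1$, and $z_2$, and $\Phi_{z_2 z_1}$ is the coefficient on $z_1$ in the regression of $z_2$ on $z_1$ and $s$.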
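
The omitted variables formula used in Appendix 3 can be illustrated with exact arithmetic on population moments. The sketch below is an illustration under stated assumptions: the variance matrix `V` for (s, z1, z2) and the covariance vector `c` with the dependent variable are hypothetical, and `solve` is a small Gauss-Jordan helper written for this example. It computes the short-regression coefficient on z1, the long-regression coefficients, and the auxiliary regression of z2 on s and z1, then compares the two sides of the formula.

```python
from fractions import Fraction as F


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


# Hypothetical population moments, ordered (s, z1, z2).
V = [[F(4), F(1), F(2)],
     [F(1), F(5), F(1)],
     [F(2), F(1), F(6)]]
# Covariances of (s, z1, z2) with y = Lambda*v + e; cov(s, y) = 0 by assumption.
c = [F(0), F(3), F(2)]

V_short = [row[:2] for row in V[:2]]           # variance matrix of (s, z1)

phi1_z1 = solve(V_short, c[:2])[1]             # y on (s, z1): coefficient on z1
long_coefs = solve(V, c)                       # y on (s, z1, z2)
phi2_z1, phi_z2 = long_coefs[1], long_coefs[2]
phi_z2z1 = solve(V_short, [V[0][2], V[1][2]])[1]   # z2 on (s, z1): coefficient on z1

print(phi1_z1, phi2_z1 + phi_z2 * phi_z2z1)
```

Because the formula is an algebraic identity, the comparison holds for any positive definite `V`, not only for the values chosen here.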