Employer Learning and Statistical Discrimination

Joseph G. Altonji, Northwestern University and Federal Reserve Bank of Chicago
Charles R. Pierret, Bureau of Labor Statistics

Working Papers Series, Macroeconomic Issues, Research Department, Federal Reserve Bank of Chicago, December 1997 (WP-97-11)

First Draft: July 1995
This version: October 1997

This research was supported by the Institute for Policy Research, Northwestern University, the Bureau of Labor Statistics, U.S. Department of Labor, and the National Science Foundation. We owe a special debt to Nachum Sicherman for assisting us with the NLSY data. We thank Paul Devereux, Judith Hellerstein, Derek Neal, Bruce Weinberg, and participants in seminars at BLS, Berkeley, Boston University, Columbia, the Federal Reserve Bank of Chicago, Indiana University, University of Maryland, McMaster, NBER, Northwestern, the University College London, the Upjohn Institute and the U. of Western Ontario for helpful comments. We are responsible for all errors and omissions. The opinions stated in this paper do not necessarily represent the official position or policy of the U.S. Department of Labor, the Federal Reserve Bank of Chicago, or the Federal Reserve System.

Abstract

We provide a test for statistical discrimination or "rational" stereotyping in environments in which agents learn over time. Our application is to the labor market. If profit maximizing firms have limited information about the general productivity of new workers, they may choose to use easily observable characteristics such as years of education to "statistically discriminate" among workers. As firms acquire more information about a worker, pay will become more dependent on actual productivity and less dependent on easily observable characteristics or credentials that predict productivity.
Consider a wage equation that contains both the interaction between experience and a hard to observe variable that is positively related to productivity and the interaction between experience and a variable that firms can easily observe, such as years of education. We show that the wage coefficient on the unobservable productivity variable should rise with time in the labor market and the wage coefficient on education should fall. We investigate this proposition using panel data on education, the AFQT test, father's education, and wages for young men and their siblings from the NLSY. We also examine the empirical implications of statistical discrimination on the basis of race. Our results support the hypothesis of statistical discrimination, although they are inconsistent with the hypothesis that firms fully utilize the information in race. Our analysis has wide implications for the analysis of the determinants of wage growth and productivity and the analysis of statistical discrimination in the labor market and elsewhere.

JEL Classification: D83, J31

Joseph G. Altonji
Department of Economics
Northwestern University
Evanston, IL 60208
(847) 492-8218
altonji@nwu.edu

Charles Pierret
Bureau of Labor Statistics
2 Massachusetts Ave. NE, Suite 4945
Washington, D.C. 20212
(202) 606-7519
pierret_c@bls.gov

1. Introduction

People go through life making an endless stream of judgments on the basis of limited information about matters as diverse as the safety of a street, the quality of a car, the suitability of a potential spouse, and the skill and integrity of a politician. When hiring, employers must assess the value of potential workers with only the information contained in resumes, recommendations, and personal interviews. What do employers know about the productivity of young workers, and how quickly do they learn?
Given lack of information about actual productivity, do employers "statistically discriminate" among young workers on the basis of easily observable variables such as education, race, and other clues to a worker's labor force preparation? Many issues in labor economics hinge on the answers, including the empirical relevance of the signaling model of education (Weiss (1995)), statistical theories of discrimination (Aigner and Cain (1977), Lundberg and Startz (1983)), and the interpretation of earnings dynamics. The desirability of changes in the laws governing hiring procedures, evaluation of employees, layoff and firing costs, and the provision of references for former employees also hinges on the answers. Although labor economists typically assume wages are strongly influenced by employer beliefs about worker productivity, there is little empirical research on how much employers know about their workers, or about how this information changes with time in the labor market.[1]

In this paper we explore the implications of a hypothesis that we refer to as Statistical Discrimination with Employer Learning, or SD-EL. Our working hypothesis is that firms have only limited information about the quality of workers in the early stages of their careers. They distinguish among workers on the basis of easily observable variables that are correlated with productivity such as years of education or degree, the quality of the school the person attended, race, and gender. (To avoid misunderstanding we wish to stress that part of the relationship between wages and race and gender may reflect biased inferences on the part of employers or other forms of discrimination that have nothing to do with productivity or information.) Firms weigh this information with other information about outside activities, work experience to date, references, the job interview, and perhaps formal testing by the firm. Each period, the firm observes noisy indicators of the worker's performance.
Over time, these make the information observed at the start redundant. Wages become more closely tied to actual productivity and less strongly dependent upon the information that was readily available at the beginning of a worker's career. The main contribution of the paper is to provide a way to test whether firms statistically discriminate on the basis of readily available information such as education and race. We also provide a way to estimate the learning profile of firms and address the issue of whether firms have a stable view of the productivity of workers with many years of labor market experience.

[1] There is a large empirical and theoretical literature on labor market search and on the effects of learning about the quality of the job match on wages and mobility. See Devine and Kiefer (1991) for a comprehensive survey.

Our research builds on some previous work, particularly Farber and Gibbons (1996).[2] Farber and Gibbons investigate three implications of employer learning. Imagine a variable $s$ (say schooling) which firms can observe directly and a second variable, $z$ (say AFQT test scores or a sibling's wage rate) which firms cannot observe directly. They show first that employer learning does not imply that the coefficient on $s$ in a wage regression will change with experience. This is because future observations, on average, simply validate the relationship between expected productivity and $s$ for new entrants. Their empirical evidence is generally supportive of this result, although they note that a positive interaction could arise if schooling is complementary with training. (Positive interactions are found in a number of data sets, including the PSID.) Second, they show that the part of $z$ that is orthogonal to information available to employers at the beginning of a worker's career will have an increasingly large association with wages as time passes. Third, they note that wage growth will be a martingale process, at least in the case in which productivity of the worker is constant.
In this paper we focus on a different but related proposition that allows us to examine the issue of statistical discrimination. The proposition concerns how controlling for the experience profile of the effect of $z$ on wages alters the interaction between experience and $s$. We show that not only should the coefficient on $z$ rise with time in the labor market, but the coefficient on $s$ should fall. We investigate these propositions using data on young men from the NLSY. We also explore the implications of statistical discrimination on the basis of race, which is also easily observable to employers and is correlated with hard to observe background variables that influence productivity.[3] While our basic theoretical framework and most of the empirical analysis assume that all employers have the same information about workers, we provide a preliminary discussion of the implications of models in which the current employer has an information advantage.

[2] Other relevant references are Gibbons and Katz (1991), which we discuss below, and Parsons (1993). Glaezer (1992) uses variances in wage innovations as a measure of learning. His work is somewhat closely related to Farber and Gibbons. However, he attempts to distinguish between information that is specific to the job match and information about general productivity. Foster and Rosenzweig (1993) use data on piece rate and time-rate workers to investigate several implications of imperfect information on the part of employers that are different from the one studied here. Their results imply that the incompleteness of employer information is an important issue. Studies following performance evaluations within firms based on the EOPP data, or studies using firm personnel files (Medoff and Abraham (1980)) are also relevant, but have a very different focus than the present paper. Parsons (1986), Weiss (1995) and Carmichael (1989) provide useful discussions of some of the theoretical issues on the link between wages and employer perceptions about productivity. Albrecht (1982) conducts a test of screening models of education based on the idea that education will have less impact on the probability a worker will be hired if the worker was referred to the firm by another worker because some of the information contained in education will be transmitted through the referral. Montgomery (1991) presents a model in which employers obtain valuable information on the productivity of new employees through referrals and is part of a large literature on labor market networks. For empirical evidence see Holzer (1988).

In Section 2 we present our basic theoretical framework in a setting in which information is public, and then informally discuss the case in which it is private. We also consider the effect that associations between $s$, $z$, and job training would have on the analysis. In Section 3 we discuss the NLSY data used in the study. In Sections 4 and 5 we present our results for education and race. In Section 6 we present results in which we control for job training. In Section 7 we discuss the case in which employer information is private and provide some evidence on how hard to observe variables are related to the probability of a layoff and the wage losses associated with layoffs. In Section 8 we point out that interpreting our estimates of the time profile of the effect of AFQT on wages as the result of employer learning implies that high ability workers would have a substantial financial incentive to take the AFQT to differentiate themselves from those who are less able in this dimension. The fact that we do not generally observe this raises additional research questions.
In Section 9 we close the paper with a discussion of some of the implications of our analysis for a number of standard topics in labor economics and a research agenda.

2. Implications of Statistical Discrimination and Employer Learning for Wages

2.1 A Model of Employer Learning and Wages

In this section we show how the wage coefficients on characteristics that employers can observe directly and on characteristics they cannot observe directly will change with experience if employers statistically discriminate and become better informed about workers over time.

[3] We are using the term "statistical discrimination" as synonymous with the use of the term "rational expectations" in the economics literature. We mean that in the absence of full information, firms distinguish between individuals with different characteristics based on statistical regularities. In other words, we mean that firms form stereotypes that are rational in the sense that they are consistent with reality. Many papers that use the term statistical discrimination analyze race or gender differentials that arise because firms have trouble processing the information they receive about the performance of minority group members. This difficulty may lead to negative outcomes for minorities because it lowers their incentives to make unobservable investments that raise productivity. It also may lead to negative outcomes if the productivity of a job match depends on the fit between the worker and the job. Some papers also consider whether firms that start with incorrect beliefs about the relationship between personal characteristics and productivity (inaccurate stereotypes) would correct them, and, in models with worker investment, whether the priors held by firms may be self fulfilling. See Aigner and Cain (1977), Lundberg and Startz (1983), Lang (1986), Coate and Loury (1993), and Oettinger (1996). In Oettinger's model productivity is match specific and productivity signals are noisier for blacks than whites. As a result the sorting process across job changes is less efficient for blacks, and a race gap develops over time.
Our model is very similar to Farber and Gibbons (1996). Let $y_{it}$ be the log of labor market productivity of worker $i$ with $t_i$ years of experience. $y_{it}$ is determined by

(1)  $y_{it} = r s_i + H(t_i) + \alpha_1 q_i + \Lambda z_i + \eta_i$

where $s_i$ is years of schooling, $z_i$ is a vector of correlates of productivity that are not observed directly by employers but are available to the econometrician, and $H(t_i)$ is the experience profile of productivity. The variable $\eta_i$ consists of other determinants of productivity and is not directly observed by the employers or the econometrician. The elements of $z_i$ might be a test score, the income of an older sibling, father's education, or indicators of childhood environment such as books in the home or ownership of a library card. We normalize $z_i$ so that all the elements of the conformable coefficient vector $\Lambda$ are positive. Without loss of generality we scale $\eta_i$ so that it has a unit coefficient in the productivity equation. In addition to $s_i$, the employer observes a vector $q_i$ of other information about the worker that is relevant to productivity. The elements of $q_i$ are related to productivity by the coefficient vector $\alpha_1$. For now we assume that the experience profile of productivity does not depend on $s_i$, $z_i$, $q_i$, or $\eta_i$. In section 2.2 we discuss the sensitivity of our analysis to this assumption. In most of the analysis we suppress the $i$ subscript. All variables are expressed as deviations from population means. Although we use years of schooling and race as our examples of $s$, our analysis applies to any variable that employers can easily observe.
We assume that the conditional expectations $E(z|s,q)$ and $E(\eta|s,q)$ are linear in $q$ and $s$, so

(2)  $z = E(z|s,q) + v = \gamma_1 q + \gamma_2 s + v$

(3)  $\eta = E(\eta|s,q) + e = \alpha_2 s + e$,

where the vector $v$ and the scalar $e$ have mean 0 and are uncorrelated with $q$ and $s$ by definition of an expectation.[4] The links from $s$ to $z$ and $\eta$ may be partially due to a causal effect of $s$.[5] Equations (1), (2) and (3) imply that $\Lambda v + e$ is the error in the employer's belief about the log of productivity of the worker at the time the worker enters the labor market. The sum $\Lambda v + e$ is uncorrelated with $s$ and $q$. We make the additional assumption that $\Lambda v + e$ is independent of $q$ and $s$.

[4] The exclusion of $q$ from the conditional mean of $\eta$ is innocuous, since we are simply defining $q$ and the coefficient vector $\alpha_1$ on $q$ in (1) so that the mean of $\eta$ does not depend on $q$.

[5] For example, below we use the Armed Forces Qualification Test (AFQT) as $z$ and years of education as $s$, and Neal and Johnson (1996) present evidence that years of education have a sizable positive effect on AFQT.

Each period that a worker is in the labor market, firms observe a noisy signal of the productivity of the worker,

(4)  $\xi_t = y + \varepsilon_t$

where $y$ is $y_t - H(t)$ and $\varepsilon_t$ reflects transitory variation in the performance of worker $i$ and the effects of variation in the firm environment that are hard for the firm to control for in evaluating the worker. (We continue to suppress the $i$ subscripts.) The term $\varepsilon_t$ is assumed to be independent of the other variables in the model. We are also implicitly assuming that the component of $\varepsilon_t$ that reflects temporal variation in productivity from sources specific to worker $i$ is serially uncorrelated, because otherwise firms would have an incentive to base compensation in $t+1$ on what they know about the worker specific component of $\varepsilon_t$.[6] However, $\varepsilon_t$ may be serially correlated as a result of the other factors. Since the employers know $q$ and $s$, observing $\xi_t$ is equivalent to observing

(5)  $d_t = \Lambda v + e + \varepsilon_t = \xi_t - E(y|s,q)$.

The vector $D_t = \{d_1, d_2, \ldots, d_t\}$ summarizes the worker's performance history.
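The claim that $\Lambda v + e$ is the employer's initial error can be checked by substitution. The short derivation below is a sketch that uses only equations (1) through (3), with the $i$ subscripts suppressed as in the text.

```latex
\begin{align*}
y_t - H(t) &= r s + \alpha_1 q + \Lambda z + \eta \\
           &= r s + \alpha_1 q + \Lambda(\gamma_1 q + \gamma_2 s + v) + (\alpha_2 s + e) \\
           &= (r + \Lambda\gamma_2 + \alpha_2)\, s + (\alpha_1 + \Lambda\gamma_1)\, q + (\Lambda v + e).
\end{align*}
```

Because $v$ and $e$ have conditional mean zero given $s$ and $q$, the first two terms are the employer's expectation of $y_t - H(t)$ at labor market entry, and the remainder $\Lambda v + e$ is what the employer still has to learn. Subtracting that expectation from the signal in (4) leaves $\Lambda v + e + \varepsilon_t$.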
Let $\mu_t$ be the difference between $\Lambda v + e$ and $E(\Lambda v + e|D_t)$. By definition $\mu_t$ is uncorrelated with $D_t$, $q$ and $s$, but in addition we assume $\mu_t$ is distributed independently of $D_t$, $q$ and $s$.

We also assume for now that $q$, $s$, and the worker's performance history (summarized by the vector $D_t = \{d_1, d_2, \ldots, d_t\}$) are known to all employers, as in Farber and Gibbons. (We discuss the private information case in Section 7.) As a result of competition among firms, the worker receives a wage $W_t$ equal to the expected value of productivity $Y_t$ ($Y_t = \exp(y_t)$) times the multiplicative error component $\exp(\varsigma_t)$ that reflects measurement error and firm specific factors that are outside the model and are unrelated to $s$, $z$, and $q$. The wage model is

(6)  $W_t = E(Y_t|s,q,D_t)\, e^{\varsigma_t}$

Using (1), (2), (3) and (6) leads to

(7)  $W_t = E(Y_t|s,q,D_t)\, e^{\varsigma_t} = e^{rs + H(t)}\, e^{(\alpha_1 + \Lambda\gamma_1)q + (\alpha_2 + \Lambda\gamma_2)s}\, e^{E(\Lambda v + e|D_t)}\, E(e^{\mu_t})\, e^{\varsigma_t}$

Taking logs and collecting terms leads to

(8)  $w_t = (r + \Lambda\gamma_2 + \alpha_2)s + H^*(t) + (\Lambda\gamma_1 + \alpha_1)q + E(\Lambda v + e|D_t) + \varsigma_t$

where $w_t = \log(W_t)$ and $H^*(t) = H(t) + \log(E(e^{\mu_t}))$. We will suppress the $\varsigma_t$ term in the equations that follow.

[6] The firm's knowledge of a serially correlated productivity component would imply serially correlated transitory variation in the wage error of the type found by Farber and Gibbons (1996), but would not have much effect on our analysis.

In the context of the debate over signaling models of education Riley (1979) and others have noted that unless the relationship between schooling and actual productivity changes, the coefficient on $s$ will not change. This is true regardless of why $s$ is related to productivity. Farber and Gibbons also make this point by showing in a similar model that the expected value of the coefficient of an OLS regression of $w_t$ on $s$ does not depend on $t$. They estimate an equation of the form

(8a)  $w_t = b_{st}\, s + H(t) + \alpha_1 q + E(\Lambda v + e|D_t)$

with $q$ treated as an error component. They find that $b_{st}$ does not depend much on $t$.
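The first Farber and Gibbons result can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes normally distributed $v$, $e$ and iid normal signal noise (so that the conditional expectation of $\Lambda v + e$ is a fixed weight times the average signal), drops $q$ and $H(t)$, and uses made-up parameter values. The function name `slope_on_schooling` is hypothetical.

```python
import math
import random


def slope_on_schooling(t, n=20000, seed=0):
    """OLS slope from regressing a simulated log wage on schooling alone."""
    rng = random.Random(seed)
    # Illustrative parameter values (assumptions, not estimates from the paper).
    r, lam, gamma2, alpha2 = 0.05, 0.5, 0.6, 0.1
    sd_v, sd_e, sd_eps = 1.0, 0.5, 1.0
    var_x = (lam * sd_v) ** 2 + sd_e ** 2  # var(Lambda*v + e)
    # Weight the employer puts on the average signal; zero with no work history.
    weight = 0.0 if t == 0 else var_x / (var_x + sd_eps ** 2 / t)

    s_vals, w_vals = [], []
    for _ in range(n):
        s = rng.gauss(0.0, 1.0)
        x = lam * rng.gauss(0.0, sd_v) + rng.gauss(0.0, sd_e)  # Lambda*v + e
        # The average of t iid noise terms has standard deviation sd_eps/sqrt(t).
        noise_bar = rng.gauss(0.0, sd_eps / math.sqrt(t)) if t > 0 else 0.0
        # Log wage as in (8), with q and H*(t) set to zero for simplicity.
        w = (r + lam * gamma2 + alpha2) * s + weight * (x + noise_bar)
        s_vals.append(s)
        w_vals.append(w)

    s_bar = sum(s_vals) / n
    w_bar = sum(w_vals) / n
    cov_sw = sum((a - s_bar) * (b - w_bar) for a, b in zip(s_vals, w_vals))
    var_s = sum((a - s_bar) ** 2 for a in s_vals)
    return cov_sw / var_s


if __name__ == "__main__":
    for t in (0, 5, 20):
        print(t, round(slope_on_schooling(t), 3))
```

Under these assumptions the estimated slope stays close to $r + \Lambda\gamma_2 + \alpha_2$ at every experience level, because the learned component is uncorrelated with $s$ by construction.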
Farber and Gibbons also make a second point, which is that if one adds the component $z'$ of $(\Lambda v + e)$ that is uncorrelated with the employer's initial information $s$ and $q$ to the wage equation and estimates

(8b)  $w_t = b_{st}\, s + b_{z't}\, z' + H(t) + \alpha_1 q + E(\Lambda v + e|D_t)$,

the coefficients on $s$ do not depend on $t$. This follows almost immediately from the first result, because adding a second variable to a regression model has no effect on the expected value of the first if the two are uncorrelated. They provide evidence from the NLSY that $b_{st}$ is relatively constant and $b_{z't}$ is increasing in $t$.

In this paper, we establish a related set of results that permit one to examine the issue of statistical discrimination. We begin with the case in which $z$ and $s$ are scalars and then consider the more general cases. Among those who are working, the means of $q$, $s$, and $z$ may depend on $t$, which will influence estimates of $H^*(t)$. However, we assume throughout that among those who are working the covariances among $q$, $s$, and $z$ do not depend on $t$. Under these assumptions the variances and covariances involving $q$, $s$, and $z$ and the regression coefficients $\Phi_{qs}$ and $\Phi_{qz}$ defined in (10) below do not vary with $t$.[7]

[7] Estimates of the experience profile $H^*(t)$ will be affected if the means of $s$, $q$, and $z$ depend on $t$ but this has no bearing on our analysis.

Case 1: z is a scalar.

The analysis is cleanest when $z$ and $s$ are scalars. Least squares regression will identify the parameters of the expectation of $w_t$ on $s$, $z$, and the experience profile $H^*(t)$.[8] Let $b_{st}$ and $b_{zt}$ be the coefficients on $s$ and $z$ in the conditional expectation function when $t = 0, \ldots, T$, with

(9)  $E(w_t|s,z,t) = b_{st}\, s + b_{zt}\, z + H^*(t)$.

When the individual starts work ($t$ is 0) this equation is

(9a)  $E(w_0|s,z,0) = b_{s0}\, s + b_{z0}\, z + H^*(0)$

To simplify the algebra but without any additional assumptions we re-interpret $s$, $z$, and $q$ as components of $s$, $z$, and $q$ that are orthogonal to $H^*$.
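Then the wage process (7), the fact that $E(\Lambda v + e|D_0)$ is 0 (since there is no work history when $t = 0$), and some straightforward algebra involving the least squares regression omitted bias formula imply that

(10)  $\begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix} = \begin{bmatrix} r + \Lambda\gamma_2 + \alpha_2 \\ 0 \end{bmatrix} + \begin{bmatrix} \Phi_{qs} \\ \Phi_{qz} \end{bmatrix}$

where $\Phi_{qs}$ and $\Phi_{qz}$ are the coefficients of the auxiliary regression of $(\alpha_1 + \Lambda\gamma_1)q$ on $s$ and $z$. The parameters $\{b_{st}, b_{zt}\}'$ are the sum of $\{b_{s0}, b_{z0}\}'$ and the coefficients of the regression of $E(\Lambda v + e|D_t)$ on $s$ and $z$. That is,

(11)  $\begin{bmatrix} b_{st} \\ b_{zt} \end{bmatrix} = \begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix} + \frac{1}{|\mathrm{var}(s,z)|} \begin{bmatrix} \mathrm{var}(z) & -\mathrm{cov}(s,z) \\ -\mathrm{cov}(s,z) & \mathrm{var}(s) \end{bmatrix} \begin{bmatrix} 0 \\ \mathrm{cov}(v, E(\Lambda v + e|D_t)) \end{bmatrix}$

where $|\mathrm{var}(s,z)|$ is the determinant of $\mathrm{var}(s,z)$ and we use the facts that $\mathrm{cov}(s, E(\Lambda v + e|D_t)) = 0$ and $\mathrm{cov}(z, E(\Lambda v + e|D_t)) = \mathrm{cov}(v, E(\Lambda v + e|D_t))$. This may be rewritten as

$\begin{bmatrix} b_{st} \\ b_{zt} \end{bmatrix} = \begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix} + \theta_t\, \frac{1}{|\mathrm{var}(s,z)|} \begin{bmatrix} \mathrm{var}(z) & -\mathrm{cov}(s,z) \\ -\mathrm{cov}(s,z) & \mathrm{var}(s) \end{bmatrix} \begin{bmatrix} 0 \\ \Lambda\,\mathrm{var}(v) + \mathrm{cov}(v,e) \end{bmatrix}$

or

(12a)  $b_{st} = b_{s0} + \theta_t \Phi_s$

(12b)  $b_{zt} = b_{z0} + \theta_t \Phi_z$

where $\Phi_s$ and $\Phi_z$ are the coefficients of the regression of $\Lambda v + e$ on $s$ and $z$ and

$\theta_t = \mathrm{cov}(E(\Lambda v + e|D_t), z)/\mathrm{cov}(\Lambda v + e, z) = \mathrm{cov}(E(\Lambda v + e|D_t), v)/\mathrm{cov}(\Lambda v + e, v)$

is a parameter that is specific to the experience level $t$. Note that $\theta_t\Phi_s$ and $\theta_t\Phi_z$ are the coefficients of the regression of $E(\Lambda v + e|D_t)$ on $s$ and $z$ and that $\theta_t$ summarizes how much the firm knows about $\Lambda v + e$ at time $t$. It is easy to show (see Appendix 1) that $\Phi_s = -\Phi_z\Phi_{zs}$, where $\Phi_{zs}$ is the coefficient of the regression of $z$ on $s$. (This is the basis of Proposition 3 below.)

[8] Technically, it identifies the coefficients of the least squares linear projection of $w_t$ on $s$, $z$, and $H^*(t)$ if $E(\Lambda v + e|D_t)$ is not linear in the functions of $s$, $z$, and $t$ we introduce in our regression models. We ignore this distinction.

To determine the behavior of $\theta_t\Phi_s$ and $\theta_t\Phi_z$ over time, note first that $\Phi_s < 0$ and $\Phi_z > 0$ if $\mathrm{cov}(v, \Lambda v + e) > 0$ and $\mathrm{cov}(s,z) > 0$. The latter condition is true when $s$ is schooling and the scalar $z$ is AFQT, father's education, or the wage rate of an older sibling. The condition $\mathrm{cov}(v, \Lambda v + e) > 0$ simply states that the unobserved (by the firm) productivity subcomponent $v$ and the composite unobserved productivity term $\Lambda v + e$ have a positive covariance.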
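The sign pattern and the identity $\Phi_s = -\Phi_z\Phi_{zs}$ can be checked numerically in the scalar case. The sketch below inverts the 2 x 2 covariance matrix of $(s, z)$ by hand, as in the display preceding (12a). The function name `learning_loadings` and the input values are illustrative assumptions, not quantities from the paper.

```python
def learning_loadings(var_s, var_z, cov_sz, cov_v_x):
    """Coefficients of the regression of (Lambda*v + e) on s and z, scalar case.

    cov_v_x stands for cov(v, Lambda*v + e). The model implies
    cov(s, Lambda*v + e) = 0 and cov(z, Lambda*v + e) = cov(v, Lambda*v + e).
    """
    det = var_s * var_z - cov_sz ** 2      # determinant of var(s, z)
    phi_s = -cov_sz * cov_v_x / det        # first row of the inverse times (0, cov_v_x)'
    phi_z = var_s * cov_v_x / det          # second row of the inverse times (0, cov_v_x)'
    phi_zs = cov_sz / var_s                # coefficient of the regression of z on s
    return phi_s, phi_z, phi_zs


if __name__ == "__main__":
    phi_s, phi_z, phi_zs = learning_loadings(1.0, 2.0, 0.6, 0.5)
    print(phi_s, phi_z, phi_zs)
    print("identity gap:", phi_s + phi_z * phi_zs)
```

With a positive covariance between $s$ and $z$ and a positive $\mathrm{cov}(v, \Lambda v + e)$, the loading on $s$ is negative and the loading on $z$ is positive, which is what drives the opposite movements of the two wage coefficients.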
This seems plausible to us for the $z$ variables we consider.

The change over time in $b_{st}$ and $b_{zt}$ is determined by $\theta_t$. Intuitively, $\theta_t$ is bounded between 0 and 1. It is 0 in period 0, because in this period employers know nothing about $\Lambda v + e$, so $E(\Lambda v + e|D_0) = 0$. The coefficient is 1 if $E(\Lambda v + e|D_t)$ is $\Lambda v + e$, since in this case the employer has learned what $\Lambda v + e$ is and thus knows productivity $y$. It is also intuitive that $\theta_t$ is nondecreasing in $t$ because the additional information that arrives as the worker's career progresses permits a tighter estimate of $\Lambda v + e$.[9] The regularity conditions on the $\varepsilon_t$ process that are required for the time average of $\varepsilon_t$ to converge almost surely to 0 as $t$ becomes large constitute sufficient conditions for $\theta_t$ to converge to 1 as $t$ becomes large. (See Theorem 3.47 in White (1984) for a very general set of conditions.) These conditions limit the degree of dependence among the $\varepsilon_t$ and also restrict the variances. The intuition for this is that future values of $\varepsilon_t$ must be sufficiently independent of the earlier $\varepsilon$'s to average out, and must not be so variable that the future $d_t$ values have no new information about $\Lambda v + e$.[10]

A simple example may be helpful. If $\varepsilon_t$ is iid with variance $\sigma_\varepsilon^2$, then $\theta_t$ has the familiar form

(13)  $\theta_t = \dfrac{\mathrm{var}(\Lambda v + e)}{\mathrm{var}(\Lambda v + e) + \sigma_\varepsilon^2/t}$ for $t > 0$, $\theta_0 = 0$.

In this case, $\theta_t$ is strictly increasing in $t$ because the independence among the $\varepsilon_t$ means that each new signal $d_t$ has some new information about $\Lambda v + e$. $\theta_t$ is 0 when $t$ is 0 and converges to 1 as $t$ goes to infinity.

[9] To establish this note that since $D_{t-1}$ is a subset of the information in $D_t$, $[\mathrm{Cov}(v, E(\Lambda v + e|D_t) - E(\Lambda v + e|D_{t-1}))]/\mathrm{Cov}(v, \Lambda v + e) = \theta_t - \theta_{t-1} \geq 0$.

[10] To establish the result note that in each period, firms observe $d_t = \Lambda v + e + \varepsilon_t$. In general, the form of $E(\Lambda v + e|D_t)$ will depend on the pattern of serial correlation and the relative variances of $\varepsilon_1, \ldots, \varepsilon_t$. However, the firm can always choose to use $E(\Lambda v + e|\bar{D}_t)$, where $\bar{D}_t$ is the time average $\sum_{k=1}^{t}(\Lambda v + e + \varepsilon_k)/t$, as an unbiased but perhaps inefficient estimator given $D_t$. If as $t$ goes to infinity $\bar{D}_t$ converges almost surely to $\Lambda v + e$, then $\mathrm{Cov}(E(\Lambda v + e|\bar{D}_t), v)/\mathrm{Cov}(\Lambda v + e, v)$ converges to 1 as $t$ goes to infinity. Since $E(\Lambda v + e|D_t)$ is more efficient than $E(\Lambda v + e|\bar{D}_t)$, $E(\Lambda v + e|D_t)$ must also converge almost surely to $\Lambda v + e$, which establishes that $\mathrm{Cov}(E(\Lambda v + e|D_t), v)/\mathrm{Cov}(\Lambda v + e, v)$ converges to 1. We conclude that $\theta_t$ converges to 1 as $t$ becomes large.

There are two conclusions, which we summarize in Propositions 1 and 2:

Proposition 1: Under the assumptions of the above model, the regression coefficient $b_{zt}$ is nondecreasing in $t$. The regression coefficient $b_{st}$ is nonincreasing in $t$.[11]

Proposition 2: If firms have complete information about the productivity of new workers, then $\partial b_{st}/\partial t = \partial b_{zt}/\partial t = 0$.

These results underlie our empirical analysis below. Using AFQT and father's education as $z$ variables, we examine the experience profile of $b_{st}$ and $b_{zt}$. The intuition for the decline in $b_{st}$ is that as employers learn the productivity of workers, $s$ will get less of the credit for an association with productivity that arises because $s$ is correlated with $z$, provided that $z$ is included in the wage equation with a time dependent coefficient and can claim the credit.[12] We also are able to estimate the time profile of $\theta_t$ up to scale. Under the assumption that employers learn about $v$ and $e$ at the same rate, this enables us to estimate the time profile of employer learning about productivity up to scale. In AP (1996) we examine the implications of our estimates for pure signaling models of the return to education.

The model also implies a third result, which we state in Proposition 3.

Proposition 3: Under the assumptions of the above model, $\partial b_{st}/\partial t = -\Phi_{zs}\, \partial b_{zt}/\partial t$.

Since $\Phi_{zs}$ is simply the regression coefficient of $z$ on $s$ and can be estimated, the coefficient restriction in Proposition 3 may provide leverage in differentiating between the learning/statistical discrimination model and alternative explanations for the behavior of $b_{st}$ and $b_{zt}$.
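The iid example in (13) and the coefficient paths in (12a) and (12b) are simple enough to trace out directly. The sketch below is an illustration under that iid assumption only. The function names and the numerical inputs are hypothetical, and the loadings are chosen to satisfy $\Phi_s = -\Phi_z\Phi_{zs}$ with $\Phi_{zs} = 0.6$.

```python
def theta(t, var_x, sigma_eps_sq):
    """Learning parameter from (13) for iid noise; var_x is var(Lambda*v + e)."""
    if t == 0:
        return 0.0
    return var_x / (var_x + sigma_eps_sq / t)


def coefficient_paths(b_s0, b_z0, phi_s, phi_z, var_x, sigma_eps_sq, horizon):
    """Return [(b_st, b_zt)] for t = 0..horizon using (12a) and (12b)."""
    paths = []
    for t in range(horizon + 1):
        th = theta(t, var_x, sigma_eps_sq)
        paths.append((b_s0 + th * phi_s, b_z0 + th * phi_z))
    return paths


if __name__ == "__main__":
    # Illustrative values only: phi_s = -phi_zs * phi_z with phi_zs = 0.6.
    phi_zs, phi_z = 0.6, 0.3
    phi_s = -phi_zs * phi_z
    for t, (b_s, b_z) in enumerate(
        coefficient_paths(0.10, 0.02, phi_s, phi_z, 0.5, 1.0, 10)
    ):
        print(t, round(b_s, 4), round(b_z, 4))
```

In this sketch the coefficient on $s$ falls and the coefficient on $z$ rises with experience, as in Proposition 1, and the change in the $s$ coefficient is always $-\Phi_{zs}$ times the change in the $z$ coefficient, which is the restriction in Proposition 3 written in differences rather than derivatives.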
Additional Empirical Implications

As noted in footnote 3, the literature on statistical discrimination as well as the literature on labor market networks has emphasized that the amount of information that is available to firms (or the mapping between a given set of data and what the firm actually knows) may differ across groups. Our model implies that these differences will lead to group differences in wage dynamics. To see this, suppose that there are two groups, 1 and 2. For group 2 the firm's initial information set is larger than for group 1. Consequently, $\mathrm{Var}(\Lambda v + e \mid \text{group 2}) < \mathrm{Var}(\Lambda v + e \mid \text{group 1})$ and $\mathrm{cov}(\Lambda v + e, v \mid \text{group 2}) < \mathrm{cov}(\Lambda v + e, v \mid \text{group 1})$. From equation (10') or (11), it follows that $b_{st}$ and $b_{zt}$ vary less over time for group 2 than group 1. In the extreme case, when firms are fully informed about group 2, $\mathrm{cov}(\Lambda v + e, v \mid \text{group 2})$ is 0 and $b_{st}$ and $b_{zt}$ are constant. In future work, it would be interesting to use this implication as a way of testing the hypothesis that the quality of information that employers have differs across labor force groups. Theories that stress differences in the ability of employers to evaluate the performance of members of different groups imply different amounts of noise (from the point of view of the employer) in the signals $d_t$ and different paths of $\theta_t$.

[11] The coefficients on an unfavorable $z$ characteristic, such as criminal involvement or alcohol use, will become more negative to the extent that these reflect permanent traits. Assuming $s$ is negatively correlated with the unfavorable $z$, $b_{st}$ will rise with $t$. As noted earlier, we have normalized $z$ so that $\Lambda > 0$.

[12] It might be particularly interesting to see if the "diploma effect" declines with $t$ while the coefficients on hard to observe productivity characteristics that correlate with getting a diploma rise. (See Frazis (1993) for a recent analysis of whether there is a diploma effect.)

In standard labor data sets based on household interviews information on $y_i$ is not available.
However, it is interesting for at least two reasons to discuss the cross equation restrictions between the wage equation (9) and the equation relating $y_i$ to $s$ and $z$ that are implied by the model. First, data could be gathered from both firms and workers. Second, information on $y_i$ or indicators of $y_i$ may be available for use in other applications of our methods to study statistical discrimination. For example, in the study of mortgage lending, panel data on households might provide data on both credit records (related to $y_i$), success in loan applications (the counterpart to $w_{it}$), and hard to observe background variables (such as the income and wealth of relatives). Suppose that one has a measure $y_{it}^*$ that is equal to $y_i$ plus noise $\zeta_{it}$. Assume that $\zeta_{it}$ is independent of all other variables in the model. Then the model implies that

(13a)  $y_{it}^* = (b_{s0} + \Phi_s)\, s_i + (b_{z0} + \Phi_z)\, z_i + \text{error}$

where the error term is orthogonal to $s$ and $z$. Note that the coefficients are time invariant. This equation and (9) are heavily overidentified. By estimating the equations jointly one can identify $\theta_t$ separately from $\Phi_s$ and $\Phi_z$. The availability of a productivity indicator would be particularly useful when one relaxes the assumption that the effect of $s$ and $z$ on $y$ is time invariant.

Case 2: z is a vector

We now consider the case in which $z$ is a vector $z = \{z_1, z_2, \ldots, z_k, \ldots, z_K\}$. In this case,

(14)  $\begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix} = \begin{bmatrix} r + \Lambda\gamma_2 + \alpha_2 \\ 0 \end{bmatrix} + \begin{bmatrix} \Phi_{qs} \\ \Phi_{qz} \end{bmatrix}$

where $[\Phi_{qs}, \Phi_{qz}]'$ is the vector of $K+1$ coefficients from the auxiliary regression of $(\alpha_1 + \Lambda\gamma_1)q$ on $s$ and $z$. In the vector case $[b_{st}, b_{zt}]$ are:

(15)  $\begin{bmatrix} b_{st} \\ b_{zt} \end{bmatrix} = \begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix} + \mathrm{var}(s,z)^{-1} \begin{bmatrix} 0 \\ \mathrm{cov}(z, E(\Lambda v + e|D_t)) \end{bmatrix}$

where $\mathrm{var}(s,z)$ is a $(K+1) \times (K+1)$ matrix, $\mathrm{cov}(z, E(\Lambda v + e|D_t))$ is a $K$ element vector and we have used the fact that $\mathrm{cov}(s, E(\Lambda v + e|D_t)) = 0$. Let $G_k$ be the $k$th row of the $K \times K$ matrix

$G = [\mathrm{var}(s)\,\mathrm{var}(z) - \mathrm{cov}(z,s)\,\mathrm{cov}(z,s)']^{-1}$

where $\mathrm{var}(s)$ is the variance of the scalar $s$, $\mathrm{var}(z)$ is the variance of the vector $z$, and $\mathrm{cov}(z,s)$ is the vector of covariances between $s$ and the elements of $z$.
In Appendix 1 we show that $b_{st}$ and $b_{z_k t}$ can be expressed as

(16)  $b_{st} = b_{s0} - \sum_{k=1}^{K} \mathrm{cov}(s, z_k)\, G_k\, \Theta_t\, [\mathrm{cov}(z, \Lambda v + e)]$

(17)  $b_{z_k t} = b_{z_k 0} + \mathrm{var}(s)\, G_k\, \Theta_t\, [\mathrm{cov}(z, \Lambda v + e)]$

where $\Theta_t$ is the diagonal matrix whose $k$th diagonal element is $\theta_t^k = \mathrm{cov}(E(\Lambda v + e|D_t), z_k)/\mathrm{cov}(\Lambda v + e, z_k)$. Let $\Phi_{z_k s}$ be the coefficient of the regression of $z_k$ on $s$, $k = 1, \ldots, K$. Equations (16) and (17) lead to Proposition 4, which is the vector analog to Proposition 3.

Proposition 4: When $z$ is a vector and the assumptions of the above model hold, $\dfrac{\partial b_{st}}{\partial t} = -\sum_{k=1}^{K} \Phi_{z_k s}\, \dfrac{\partial b_{z_k t}}{\partial t}$.

Proposition 2 generalizes to the vector case. Proposition 1 does not. With multiple $z$ variables, one cannot in general sign $\partial b_{st}/\partial t$ and $\partial b_{z_k t}/\partial t$ even if all the elements of $\Lambda$ are positive, each element of $\mathrm{cov}(z, \Lambda v + e)$ is positive, and $\Phi_{z_k s}$ is positive and $\theta_t^k$ is increasing in $t$ for all $k$. However, from Proposition 4, it follows immediately that if $\partial b_{z_k t}/\partial t > 0$ and $\Phi_{z_k s} > 0$ for all $z_k$ used in the analysis then $\partial b_{st}/\partial t$ is $< 0$. We can verify the conditions for a particular $s$ and set of $z$ variables.[13]

If the $\theta_t^k$ are the same for each of the $z_k$ and equal to the common value $\theta_t$, then the time paths of $b_{st}$ and the elements of $b_{zt}$ will all be proportional to $\theta_t$:

(17a)  $\begin{bmatrix} b_{st} \\ b_{zt} \end{bmatrix} = \begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix} + \theta_t \begin{bmatrix} \Phi_s \\ \Phi_z \end{bmatrix}$

where, as in (12a) and (12b), $\Phi_s$ and $\Phi_z$ are the coefficients of the regression of $\Lambda v + e$ on $s$ and $z$. The $\theta_t^k$ will be the same for all $k$ if the following two conditions on the conditional distribution $f$ of $D_t$ and the conditional mean of $z_k$ hold:

Condition 1. $f(D_t \mid z_k, \Lambda v + e) = f(D_t \mid \Lambda v + e)$ for all $k$;

Condition 2. $E(z_k \mid \Lambda v + e) = \phi_k R(\Lambda v + e)$, where $\phi_k$ is a scalar that is specific to $k$ and $R$ is some function.

Basically, these conditions imply that the distribution of the signal $D_t$ is driven by $e + \Lambda v$ and that the signal $D_t$ is not more informative about particular elements of $v$ than others.[14] The conditions will hold if, for example, $d_t$ is generated by (5) and $e$ and the elements of $v$ are normally distributed. The conditions rule out the possibility that the range of a particular element of $v$, say $v_k$, is either -100,000, 0, or 100,000. In this case, a very small or very large value of $D_t$ would be very informative about $v_k$.
13 Initially we were surprised that the conditions that $\Lambda > 0$, $\mathrm{cov}(z, \Lambda v + e) > 0$, and $\theta_{kt}$ is increasing in $t$ for all $k$ do not guarantee that $b_{zt}$ is positive even if $\Phi_{z_k s} > 0$ for all $k$. The intuition is as follows. The OLS estimator of $b_{zt}$ is equivalent to regressing the wage in period $t$ on the residuals $\tilde{z}_k$ from the regression of $z_k$ on $s$. $\tilde{z}_k$ is the sum of $v_k$ plus the component of the $k$th element of $\gamma_1 q$ that is orthogonal to $s$. The components of $\gamma_1 q$ that are orthogonal to $s$ are unrelated to $v_k$ and $e$ but are likely to be correlated across $z_k$. Consequently, using OLS to estimate $b_{zt}$ is analogous to applying OLS in a situation in which several of the regressors are measured with error, and the measurement errors are correlated. (The $\tilde{z}_k$ may be thought of as noisy measures of $v_k$.) It is possible in such a situation for the probability limit of the OLS estimator to take on the wrong sign.

14 To establish the conditions, note first that

$\mathrm{Cov}(E(\Lambda v + e|D_t), z_k) = \int_{\Lambda v + e}\int_{z_k}\int_{D_t} z_k \cdot E(\Lambda v + e|D_t) \cdot g(D_t, z_k|\Lambda v) \cdot h(\Lambda v)$

$= \int_{\Lambda v + e}\int_{z_k}\int_{D_t} z_k\, E(\Lambda v + e|D_t) \cdot g(D_t|\Lambda v) \cdot f(z_k|\Lambda v) \cdot h(\Lambda v)$

$= \int_{\Lambda v + e}\int_{z_k} E(z_k|\Lambda v) \cdot E(E(\Lambda v + e|D_t)|\Lambda v) \cdot h(\Lambda v)$

and that

$\mathrm{Cov}(\Lambda v + e, z_k) = \int_{\Lambda v + e} E(z_k|\Lambda v + e) \cdot (\Lambda v + e) \cdot h(\Lambda v + e)$.

It is easy to verify from these equations that $\mathrm{Cov}(E(\Lambda v + e|D_t), z_k)/\mathrm{Cov}(z_k, \Lambda v + e)$ is the same for all $z_k$ if $E(z_k|\Lambda v + e) = \phi_k R(\Lambda v + e)$.

These conditions are quite strong. For example, if the firm obtains indicators about subcomponents of $v$ and $e$ as well as $y$, then it is likely to learn about some components of productivity faster than others. In this case, equations (16) and (17) continue to hold, but the time path of the education slope is a weighted average of the $\theta_{kt}$ that determine the time paths of the individual $z_k$. The paths of the individual $z_k$ will reflect differences across $z_k$ in the rate at which firms learn about the productivity components that they are correlated with. This is an important result, because it states that differences in the effects of particular variables on wage growth may reflect differences in the rate at which firms learn about the variables.
This provides an alternative or a complement to the standard view that the differential effects on growth rates reflect differences in the relationship between the variables and other sources of wage growth such as on-the-job training.

Case 3: s and z are both vectors.

Finally, we consider the case in which both $s$ and $z$ are vectors. In this case we reinterpret all of the related variables and parameters in the model, such as $b_{s0}$, $b_{z0}$, $v$, etc., as vectors or matrices. The vectors of coefficients $b_{s0}$ and $b_{z0}$ on $s$ and $z$ in the base year satisfy (14), where the vectors $\Phi_{qs}$ and $\Phi_{qz}$ are the coefficients in the regression of $q$ on $s$ and $z$. The vectors $b_{st}$ and $b_{zt}$ are given by

(17b) $\begin{bmatrix} b_{st} \\ b_{zt} \end{bmatrix} = \begin{bmatrix} b_{s0} \\ b_{z0} \end{bmatrix} + A\,[\mathrm{cov}(v, E(\Lambda v + e|D_t))]$

where

$A = \begin{bmatrix} -\mathrm{var}(s)^{-1}\mathrm{cov}(s,z)\,[\mathrm{var}(z) - \mathrm{cov}(z,s)\,\mathrm{var}(s)^{-1}\mathrm{cov}(s,z)]^{-1} \\ [\mathrm{var}(z) - \mathrm{cov}(z,s)\,\mathrm{var}(s)^{-1}\mathrm{cov}(s,z)]^{-1} \end{bmatrix}$

Since $\mathrm{var}(s)^{-1}\mathrm{cov}(s,z)$ is the matrix of coefficients from the regression of $z$ on $s$, we obtain the vector version of Proposition 4:

$b_{st} - b_{s0} = -\Phi_{zs}\,(b_{zt} - b_{z0})$,

where $\Phi_{zs}$ is redefined to be the matrix of coefficients of the regression of the vector $z$ on the vector $s$. When conditions 1 and 2 are satisfied and the signal $D_t$ is not more informative about particular elements of $v$ than others, then (17b) reduces to (17a) with both $b_{st}$ and $b_{zt}$ as vectors.

Statistical Discrimination on the Basis of Race

Firms observe race. If race is correlated with productivity and firms violate the law and use race as information, then race has the properties of an s variable. To see the empirical implications of this, partition $s$ into two variables, $s_1$ and $s_2$, where $s_1$ is an indicator variable equal to 1 for membership in a particular racial group and 0 otherwise, and $s_2$ is schooling.15 In this case, the model implies almost immediately that the coefficient on $s_1$ does not vary over time if the interaction between $z$ and $t$ is excluded from the model.
If this interaction is included, (17a) implies that the time paths of $b_{s_1 t}$ and $b_{s_2 t}$ are

$b_{s_1 t} = b_{s_1 0} - \Phi_{zs_1}\theta_t\Phi_z, \qquad b_{s_2 t} = b_{s_2 0} - \Phi_{zs_2}\theta_t\Phi_z$

where $\Phi_{zs_1}$ and $\Phi_{zs_2}$ are the coefficients on $s_1$ and $s_2$ in the regression of $z$ on $s_1$ and $s_2$. Assuming $\Phi_{zs_1}$ is negative, as it is when $s_1$ indicates that the person is black and $z$ is AFQT, father's education, or the wage of an older sibling, then the wage coefficient on $s_1$ will rise over time.

In contrast, if firms obey the law and do not use race as information, then in the econometric model, race has the properties of a z variable. In the case in which race is the only z variable and one s variable, such as education, is included in the analysis, the coefficient on $z$ in equation (11) corresponds to the coefficient on race. The model implies that if (i) race is negatively related to productivity ($\Lambda < 0$), (ii) firms do not statistically discriminate on the basis of race, and (iii) firms learn over time, then the race differential will widen as experience accumulates. The intuition is that with learning firms are acquiring additional information about performance that may legitimately be used to differentiate among workers. If race is negatively related to productivity, then the new information will lead to a decline in wages. If education is negatively related to race, then the coefficient on education should fall over time.

15 The element of $r$ corresponding to the race indicator $s_1$ in the productivity equation (1) is 0 unless consumer or employee tastes for discrimination reduce profitability of employing members of the minority group, as in Becker (1971). (Even if $r$ is 0, race may be negatively related to productivity if it is correlated with elements of $z$, $q$, or $\eta$ that affect productivity.) Presumably, firms that violate the law and discriminate in response to their own prejudice or the prejudice of consumers or other employees might also be willing to use race as information. Employers who harbor prejudice against certain groups may be especially unlikely to form beliefs about the productivity of those groups that are rational in the statistical sense used in this paper.

What happens if firms do not discriminate on the basis of race and one adds a second z variable with a time varying coefficient to a model that contains race and an s variable? Let $z_1$ denote race and $z_2$ denote the additional variable, and let $b_{z_1 t}$ denote the coefficient on race when experience is $t$ and $z_2$ is included in the model, and let $b^*_{z_1 t}$ denote the corresponding coefficient when $z_2$ is excluded. Assume that $\theta_{1t} = \theta_{2t} = \theta_t$, where $\theta_{kt}$ is defined below (17) above. In Appendix 3 we show that

$\dfrac{\partial b_{z_1 t}}{\partial t} - \dfrac{\partial b^*_{z_1 t}}{\partial t} = -\dfrac{\partial \theta_t}{\partial t} \cdot [\Phi_{z_2}\Phi_{z_2 z_1}]$

where $\Phi_{z_2}$ is the coefficient on $z_2$ in the regression of $\Lambda v + e$ on $s$, $z_1$, and $z_2$, and $\Phi_{z_2 z_1}$ is the coefficient on $z_1$ in the regression of $z_2$ on $z_1$ and $s$. When $z_1$ indicates whether the person is black and $z_2$ is AFQT, father's education, or the wage of an older sibling, $\Phi_{z_2 z_1}$ is negative. If these variables are positively related to productivity, with $\Phi_{z_2} > 0$, then $\partial b_{z_1 t}/\partial t - \partial b^*_{z_1 t}/\partial t > 0$. We conclude that if firms do not statistically discriminate on the basis of race and race is negatively related to productivity, then (1) the race gap will widen with experience and (2) adding a favorable z variable to the model will reduce the race difference in the experience profile. We wish to stress that other factors that influence race differences in experience profiles as well as other forms of discrimination will also influence the wage results.

2.2 Incorporating On-the-Job Training Into the Model:

The analysis so far assumes that the effects of $z$ and $s$ on the log of productivity do not depend on $t$. Human capital accumulation is included in the model through the $H(t)$ function but is assumed to be "neutral" in the sense that it does not influence the time paths of the effects of $s$ and $z$.16 In the more general case, the time paths of $z$ and $s$ depend on other factors as well as learning.
In this section we first consider the effect that such dependence would have on OLS estimates of the interactions between $t$ and $z$ and $s$. Then we discuss estimation of a more realistic model that includes both human capital accumulation and learning/statistical discrimination. As we shall see, there is no clean way to sort out the relative roles of these two mechanisms without data on productivity.

Suppose that $s$ is complementary with learning by doing or enhances the productivity of investments in general skills. We return to the case of scalar $z$. Then the productivity equation (net of training costs) might take the form

(18) $y_t = r s + r_1 s t + H(t) + \alpha_1 q + \Lambda z + \eta$.

16 One may easily modify the theoretical framework to allow for this form of human capital accumulation. For example, the $H(t)$ function may reflect learning by doing in all jobs that is observable to firms, or worker financed investments in human capital that are observable to firms.

Assuming that the training activity is observed (firms know (18)) and workers pay for the general training, the wage equation (9) becomes

(19) $w_t = (b_{st} + r_1 t)\, s + b_{zt} z + H(t) + \alpha_1 q + E(\Lambda v + e|D_t)$

Most discussions of human capital and most of the empirical evidence on employer provided training suggest that education makes workers more trainable and that educated workers receive more training. In this case $r_1$ will be greater than 0.17 Probit models of the probability that a worker receives training in a given year show strong positive effects of schooling and AFQT as well as a smaller but positive, statistically significant effect of father's education. (See below.)

What are the implications of this for our investigation of the hypothesis that the reliance of employers on easily observable variables to estimate productivity declines over the career? In estimating the model we identify the sum $b_{st} + r_1 t$ rather than $b_{st}$. If $r_1$ is greater than 0, then the estimated relationship between $b_{st} + r_1 t$ and $t$ will be biased against the hypothesis that employers learn about productivity.
As it turns out, we find a strong negative relationship between $b_{st} + r_1 t$ and $t$, which is only consistent with a training interpretation if education reduces learning by doing, the productivity of training investments, and/or the quantity of training investments.

There is also the possibility that the productivity of employer provided training and/or learning by doing depends on $z$ and/or $\eta$. This case is harder to analyze because employers do not observe $z$ and $\eta$ directly and are learning more about them as time goes on. As a start, we consider the extreme case in which firms are fully informed about $z$, so that $\theta_t$ is 1 and $b_{zt}$ in (9) is a constant in the absence of training. Suppose that the productivity equation is

(20) $y_t = r s + r_1 s t + r_2 z t + H(t) + \alpha_1 q + \Lambda z + \eta, \quad r_2 > 0$

If the firm's knowledge of $s$ and $q$ is fully informative about $z$, then the presence of $r_2$ in the productivity equation should lead the effect of $z$ on the wage to rise with experience even if $b_{zt}$ does not depend on time ($\theta_t = 1$). However, the presence of $r_2 z t$ in the productivity equation seems unlikely to lead to a negative estimate of $\partial b_{st}/\partial t$.

17 Earnings slopes depend on the expected productivity of the worker if the costs or returns to training depend on variables such as $z$ or $s$. Altonji and Spletzer (1992) find a relationship between test scores and measures of training using the NLS72 data set, and many studies find a link between schooling and training measures. See, for example, Bartel and Sicherman (1992) and Lynch (1992).

It is important to point out, however, that if the effect of $z$ on $y$ rises with $t$, then introducing the interaction between $z$ and $t$ into the wage equation could lower the estimate of the change over time in the wage response to $s$. Let $B_{st}$ be the expectation of the OLS estimator of the effect of $s$ on the wage in period $t$. Then $B_{st}$ is $b_{s0} + r_1 t + r_2 \Phi_{zs} t$, where $\Phi_{zs}$ is the coefficient of the regression of $z$ on $s$. When one adds $z \cdot t$ to the regression, $B_{st}$ becomes $b_{s0} + r_1 t$. $B_{zt}$, the expectation of the OLS estimator of the effect of $z$ in period $t$, becomes $b_{z0} + r_2 t$.
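The omitted-interaction argument in this paragraph can be checked numerically. The following is a minimal simulation sketch, not the authors' estimation code: the function name `st_slopes`, the parameter values, and the sample size are illustrative assumptions, and experience t is drawn independently of s and z for simplicity. It fits the wage regression with and without the z·t interaction and returns the coefficient on s·t in each case.

```python
import numpy as np


def st_slopes(phi_zs=0.5, r1=0.02, r2=0.03, n=200_000, seed=0):
    """Return the OLS coefficient on s*t without and with z*t in the regression.

    All parameter values are hypothetical; they are not estimates from the paper.
    """
    rng = np.random.default_rng(seed)
    s = rng.normal(size=n)                   # easy-to-observe variable
    z = phi_zs * s + rng.normal(size=n)      # regression of z on s has slope phi_zs
    t = rng.uniform(0.0, 10.0, size=n)       # experience, independent of s and z
    noise = rng.normal(scale=0.1, size=n)
    # Full-information wage: the effects of s and z both grow linearly with t.
    w = 0.08 * s + r1 * s * t + 0.05 * z + r2 * z * t + 0.04 * t + noise

    const = np.ones(n)
    x_short = np.column_stack([const, t, s, s * t, z])  # z*t omitted
    x_long = np.column_stack([x_short, z * t])          # z*t included
    b_short = np.linalg.lstsq(x_short, w, rcond=None)[0]
    b_long = np.linalg.lstsq(x_long, w, rcond=None)[0]
    return b_short[3], b_long[3]             # coefficients on s*t


slope_without_zt, slope_with_zt = st_slopes()
print(slope_without_zt, slope_with_zt)
```

Under these assumptions the slope on s·t should be close to r1 + phi_zs·r2 when z·t is omitted and close to r1 once it is included, so the change is approximately -phi_zs·r2, matching the sign of the adjustment described in the text.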
If $\Phi_{zs} > 0$ and $r_2 > 0$, then $r_2 \Phi_{zs} > 0$. The change in the coefficient on $s$ when $z \cdot t$ is added is $-\Phi_{zs} r_2 t$. Consequently, in the scalar case the simple training model with full information about $z$ implies that $\partial B_{st}/\partial t$ changes by $-\Phi_{zs}\,[\partial B_{zt}/\partial t]$ when $z \cdot t$ is added to the wage equation. In the pure employer learning/statistical discrimination model $\partial B_{st}/\partial t$ is equal to $\partial b_{st}/\partial t$ and, according to Proposition 3, the learning model also implies that $\partial B_{st}/\partial t$ changes by $-\Phi_{zs}\,\partial b_{zt}/\partial t$ when $z \cdot t$ is added to the wage equation. However, the models differ in their implications for the level of $\partial B_{st}/\partial t$ after $z \cdot t$ is added. A pure human capital model with perfect information implies $\partial B_{st}/\partial t > 0$ unless, in contrast to the available evidence, $s$ has a negative partial effect on the quantity of or return to on-the-job training ($r_1 < 0$).

Controlling for Training. In the absence of data on productivity, sorting out the relative importance of employer learning and non-neutral (with respect to $z$ and $s$) on-the-job training may require that one build a model of the quantity of training as a function of $s$ and $z$ and use a proxy based on the training model to control for the effects of non-neutral general human capital accumulation in the wage equation. This raises a number of difficulties that we explore in the next few paragraphs. We return to the case of scalar $z$. Assume that the productivity equation (net of training costs) takes the form

(21) $y_t = r s + \Psi(\Sigma T_{it}) - C(T_{it}) + H(t) + \alpha_1 q + \eta$.

where $\Sigma T_{it} = \sum_{\tau=1..t} T_{i\tau}$, $\Psi(.)$ is an increasing function that summarizes the effect of accumulated training on productivity, $C(T_{it})$ is the cost in terms of the log of productivity of $T_{it}$ units of training in period $t$, and the function $H(t)$ has been redefined to accommodate the inclusion of training. Assume that $T_{it}$ is determined by employer beliefs about productivity given $D_t$, $q$, $s$, and $t$, as well as by $D_t$, $q$, $s$, and experience.
Then

(22) $T_{it} = h(D_t, q, s, t) = \Gamma(s,z,t) + u_t$

where $\Gamma(s,z,t)$ is $E(h(D_t, q, s, t)|s,z,t)$ and $u_t$ is an error term that is related to $q$ and $D_t$ but is assumed independent of $s$, $z$, and $t$. Following through on a series of substitutions that parallels those leading to (8), and assuming that the worker pays for and receives the returns to the general training, yields the wage equation

(23) $w_t = (r + \gamma_2 + \alpha_2)s + \Psi(\Sigma T_{i\tau}) - C(T_{it}) + H^*(t) + \Lambda(\gamma_1 + \alpha_1)q + E(\Lambda v + e|D_t)$ +

Suppose that up to an irrelevant constant $\Psi(\sum_{\tau=1..t} T_{i\tau}) = \psi_1 \sum_{\tau=1..t} T_{i\tau}$ and $C(T_{it}) = c_1 T_{it}$. Then the regression function relating $w_t$ to $s$, $z$, $\Sigma T_{i\tau}$, and $T_{it}$ in period $t$ may be written as

(24) $w_t = (r + a_{1t})s + (\psi_1 + a_{2t})\Sigma T_{i\tau} + (a_{3t} - c_1)T_{it} + a_{4t} z + H^*(t)$ +

where $a_{1t}$, $a_{2t}$, $a_{3t}$, and $a_{4t}$ are the coefficients of the linear least squares projection of $\Lambda(\gamma_1 + \alpha_1)q + E(\Lambda v + e|D_t)$ onto $s$, $\Sigma T_{i\tau}$, $T_{it}$, and $z$, and the error term is unrelated to the variables in the model by definition of $a_{1t}$, $a_{2t}$, $a_{3t}$, and $a_{4t}$. The time path of $a_{1t}$ and $a_{4t}$ will be influenced by changes over time in the correlations of $s$, $z$, and $\Lambda(\gamma_1 + \alpha_1)q$ with $\Sigma T_{i\tau}$ and $T_{it}$ as well as changes over time in the correlations of $z$, $\Sigma T_{i\tau}$, and $T_{it}$ with $E(\Lambda v + e|D_t)$. (The coefficients of the experience profile $H^*(t)$ will be influenced as well.)

Two implications follow from (23) and (24). First, even if training depends only on information that is known to the firm at the start, the relationship between $q$ and $T_{it}$ and $\Sigma T_{i\tau}$ may change with $t$, leading to changes over time in the coefficient on $s$ even if there is no learning. The second point follows from the fact that training depends on $D_t$ and so will be correlated with it. The least squares estimates of the coefficients on the training variables will reflect both the direct effect of training and a relationship between the time path of $T_{it}$ and $E(\Lambda v + e|D_t)$. As a result, the effect of adding the training variables to the model on $b_{st}$ and $b_{zt}$ is complicated in a mixed human capital/employer learning model.
In particular, one might expect the addition of functions of $T_{it}$ and $\Sigma T_{i\tau}$ to the model to change and quite possibly reduce the rate of increase of $b_{zt}$, for two reasons. First, the training variables change over time and are positively correlated with $z$. Second, they will absorb part of the trend in $E(\Lambda v + e|D_t)$, and it is changes in this term that induce the variation with $t$ in $b_{st}$ and $b_{zt}$. Furthermore, the introduction of the training terms alters the partial correlation between $z$ and $s$, which changes the effect on the path of $b_{st}$ of introducing $z$ with a time varying coefficient. Unfortunately, we do not have a way to isolate the effects of training from the effects of statistical discrimination with learning if, as seems plausible, the quantity of training is influenced by the employer's beliefs about productivity.

Consider the null hypothesis that (1) learning is important, (2) variation with $s$ and $z$ in the rate of skill accumulation is not, and (3) variation in our measure of training is driven by worker performance (which leads to promotion into jobs that offer training) rather than by exogenous differences in the level of human capital investment. Even under this hypothesis one would expect the introduction of the training measures to lead to a reduction in the growth over time in the coefficient on $z$ and a reduction in the impact of $z$ on the time path of the coefficient on $s$. With an indicator of $y_{it}$, that problem is easily solved, but we lack such an indicator. Despite the absence of a clear structural interpretation of the results, we think it is important in this initial study to see how introducing measures of training alters $b_{st}$ and $b_{zt}$. Consequently, below we report estimates of (24).

There are two additional problems in using the training data. First, the measure $T^*_{it}$ of $T_{it}$ is almost certain to contain measurement error.
Second, the quality of the training data prior to 1988 is too poor to be used, which means that the data needed to form the measure $\Sigma T_{i\tau}$ are missing for persons who left school prior to that year. We do not have a solution for the first problem but deal with the latter problem by estimating a flexible model relating $T^*_{it}$ to $s$, $z$, and $t$ using data from 1988-1993 and using the model to impute values in the earlier years.18 We estimate variants of (24) below. Our preferred specification is a wage growth model based on the first difference of (24). The growth specification has the advantage of only requiring data on $T_{it}$ and $T_{it-1}$. Perhaps more importantly, this specification also eliminates bias from unobserved person specific effects that are known to firms and are correlated with both training and wages.

18 Spletzer and Lowenstein (1996) provide means of dealing with measurement error in the training data but these are beyond the scope of our study.

3. Data

The empirical analysis is based on the 1992 release of NLSY. The NLSY is a panel study of men and women who were aged 14-22 in 1978. Sample members have been surveyed annually since 1979. (In 1994 the NLSY moved to a biennial survey schedule.) The NLSY is an attractive data set for the study of employer learning and statistical discrimination. First, the sample sizes are large. Second, sample members are observed at or near the start of their work careers and are followed for several years. Third, the NLSY contains detailed employment histories, including reasons for job changes. Fourth, it contains a rich set of personal characteristics that may be related to productivity and may be hard for employers to observe, including father's and mother's education and occupation, drug and alcohol use, criminal activity, AFQT, aspirations and motivation, and performance in school. Furthermore, the data set contains a large number of siblings.
The earnings of older siblings as well as parents may be used as indicators of characteristics of younger siblings that affect productivity but are hard for employers to observe. Finally, it contains measures of training, which we need to investigate the possibility that variation with experience in the effects of schooling and our measures of hard to observe personal characteristics is due to a relationship between these variables and the quantity of training received.

We restrict the analysis to white or black men who have completed 8 or more years of education. We exclude labor market observations prior to the first time that a person leaves school and accumulate experience from that point. When we analyze wage changes, we further restrict the sample to persons who do not change education between successive years. Actual experience is the number of weeks in which the person worked more than 30 hours divided by 50. Potential experience is defined as age minus years of schooling minus 6. To reduce the influence of outliers, father's education (F_ED) is set to 4 if father's education is reported to be less than 4. AFQT is standardized by age.19 The means, standard deviations, minimums, and maximums of the variables used in the analysis are provided in Table A1 in the Appendix, along with the variable definitions. The mean of actual experience is 4.9. The mean of potential experience is 7.3, and the mean of education is 12.7. All statistics in the paper are unweighted. Blacks are oversampled in the NLSY and contribute 28.8 percent of our observations. Table A2 reports correlation coefficients and simple regression coefficients that summarize the relationships among the key variables used in the analysis.

4. Results for Education

In Tables 1-3 we report estimates of our basic wage level specification. In Table 1 we use potential experience as the experience measure and use OLS to estimate the model.
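The variable definitions given in the data section can be written out as a short sketch. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, missing father's education is represented by `None`, and "standardized by age" is taken to mean subtracting the age-specific mean and dividing by the age-specific standard deviation, as the paper's footnote on AFQT describes. The choice of the population standard deviation is an assumption, since the paper does not say which variant is used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def actual_experience(weeks_over_30_hours):
    """Weeks in which the person worked more than 30 hours, divided by 50."""
    return weeks_over_30_hours / 50.0


def potential_experience(age, years_of_schooling):
    """Age minus years of schooling minus 6."""
    return age - years_of_schooling - 6


def floor_father_education(f_ed):
    """Set F_ED to 4 when the reported value is below 4; keep missing as None."""
    if f_ed is None:
        return None
    return max(f_ed, 4)


def standardize_afqt_by_age(ages, raw_scores):
    """Subtract the mean and divide by the SD of raw scores within each age.

    Assumes every age group has at least two distinct scores, so the
    (population) standard deviation is nonzero.
    """
    by_age = defaultdict(list)
    for age, score in zip(ages, raw_scores):
        by_age[age].append(score)
    stats = {age: (mean(v), pstdev(v)) for age, v in by_age.items()}
    return [(score - stats[age][0]) / stats[age][1]
            for age, score in zip(ages, raw_scores)]
```

For example, `potential_experience(26, 12)` gives 8 years, and a worker with 245 qualifying weeks has `actual_experience(245)` of 4.9 years.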
The equations also control for a cubic in experience, a quadratic time trend, residence in an urban area, and dummy variables for whether father's education is missing and whether AFQT is missing. We add interactions between the dummy variables for missing data and experience when interactions between father's education and experience and AFQT and experience are added to the model. These variables are not reported in the tables. Standard errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker.

19 The age of the sample members at the time the AFQT was administered varies somewhat in the NLSY sample. This induces some variation in schooling levels at the time the AFQT is taken. To calculate standardized AFQT, we adjust the raw AFQT score by subtracting the mean score for each age and dividing by the standard deviation for that age. For individuals with siblings in the sample, the coefficients of the regression of the unadjusted test score of the older sibling on the test score of the younger sibling and the regression of the test score of the younger sibling on the score of the older sibling are very similar after one also controls for age, suggesting that the information in the test is not very sensitive to age over the range in the sample.

In column 4 we present an equation that includes s, Black, and s×t. This corresponds to (7a) with $b_{st}$ restricted to $b_{st} = b_{s0} + b_{s1} t$. The coefficient on s×t/10 is -.0077 (.0062), suggesting that the effect of education on wages declines slightly with experience. In column 5 we add AFQT and F_ED, where F_ED is years of father's education. As has been well documented, AFQT has a powerful association with earnings even after controlling for education. A shift in AFQT from one standard deviation below the mean to one standard deviation above is associated with an increase in the log wage of .157. The coefficient on education declines to .080 and $b_{s1}$ becomes more negative.
In column 6 we add linear interactions between t and two different z variables, AFQT and F_ED, to the equation. The resulting equation corresponds to (9) with the restriction that $b_{st} = b_{s0} + b_{s1} t$ and $b_{zt} = b_{z0} + b_{z1} t$, except that we introduce two z variables rather than one. The estimates imply that the effect of AFQT on the wage increases greatly with experience t. $b_{AFQT1}$, which is the coefficient on AFQT×t/10, is .0820 (.0125). $b_{AFQTt}$, which is $\partial w_t/\partial AFQT$, rises from only .0179 when experience is 0 to .0999 when experience is 10. The results imply that when experience is 10 and education is held constant, persons with AFQT scores one standard deviation above the mean have a log wage that is .200 larger than persons with AFQT scores one standard deviation below the mean, while the difference is only .036 when experience is 0. The effect of father's education also increases with experience. The main effect is actually slightly negative (but not significant). However, the interaction term is positive, though not statistically significant. Our results for AFQT and F_ED are consistent with Farber and Gibbons' results in which they use the components of AFQT and an indicator for whether the family had a library card when the person is 14 that are orthogonal to the wage on the first job and education.

The key result in the table is that the coefficient on s×t/10 declines sharply (to -.0351 (.0069)) when AFQT×t and F_ED×t are added. The implied effect of an extra year of education for a person with 10 years of experience is only .0633. Strikingly, the coefficient on s rises to .0984, which is almost exactly what we obtain when we exclude all terms involving F_ED and AFQT from the model (columns 1 and 4).

These results provide support for the hypothesis that employers have limited information about the productivity of labor force entrants and statistically discriminate on the basis of education.
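The implied effects quoted above follow from simple arithmetic on the reported coefficients. As a worked check, the sketch below combines each main effect with its interaction coefficient (which is scaled per 10 years of experience); the helper name `effect_at` is an illustrative assumption, and the four coefficients are the column 6 values quoted in the text.

```python
def effect_at(main_effect, interaction_per_10_years, t):
    """Marginal effect at experience t when the interaction is scaled by t/10."""
    return main_effect + interaction_per_10_years * (t / 10.0)


# Coefficients quoted in the text for column 6 of Table 1.
AFQT_MAIN, AFQT_X_T10 = 0.0179, 0.0820
EDUC_MAIN, EDUC_X_T10 = 0.0984, -0.0351

afqt_t0 = effect_at(AFQT_MAIN, AFQT_X_T10, 0)     # 0.0179
afqt_t10 = effect_at(AFQT_MAIN, AFQT_X_T10, 10)   # rounds to 0.0999

# Log wage gap between AFQT one SD above and one SD below the mean (a 2 SD shift).
gap_t0 = 2 * afqt_t0      # rounds to 0.036
gap_t10 = 2 * afqt_t10    # rounds to 0.200

educ_t10 = effect_at(EDUC_MAIN, EDUC_X_T10, 10)   # rounds to 0.0633

print(round(afqt_t10, 4), round(gap_t0, 3), round(gap_t10, 3), round(educ_t10, 4))
```

The same helper reproduces the education slope at labor force entry, since `effect_at(EDUC_MAIN, EDUC_X_T10, 0)` is just the main effect of .0984.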
Early wages are based on expected productivity conditional on easily observable variables such as education. As experience accumulates, wages become more strongly related to variables that are likely to be correlated with productivity but hard for the employer to observe directly. When we condition the experience profile of earnings on both easy to observe variables, such as education, and hard to observe variables, such as AFQT and father's education, we find the partial effect of the easy to observe variables declines substantially with experience. While one might argue that the positive coefficients on AFQT×t and F_ED×t are due to an association between these variables and training intensity, it is hard to reconcile this view with the negative coefficient on s×t. While measurement error in schooling may enhance the effect of F_ED and AFQT and may partially explain the decline in the coefficient on s between columns 1 and 3, it does not provide a simple explanation for the behavior of the interaction terms with experience.

In Table 2 we present OLS results using actual experience in place of potential experience as the experience measure t. The main difference between this table and Table 1 is that the return to education is lower and the s×t interaction is positive and fairly large in the equations that exclude AFQT×t and F_ED×t. However, the coefficient on s×t/10 declines from .0200 in column 5 to .0056 when the interaction terms are added in column 6 of Table 2. This decline is similar to the decline that we obtain in column 3. The results in Table 2 are difficult to interpret, because the intensity of work experience may be conveying information to employers about worker quality. It is an outcome measure itself. The implications of employer learning for the wage equation are changed if one conditions on information that becomes available to employers as the worker's career unfolds and may reflect the productivity of the worker.
Conditioning on actual work experience raises some of the issues that would arise if we conditioned on wages in t-1 or on training received. On the other hand, the results based on potential experience are likely to be biased by the fact that potential experience mismeasures actual experience. For this reason, in Table 3 we report the results of re-estimating the models by instrumental variables (IV), treating all terms involving actual experience as endogenous with corresponding terms involving potential experience as the instruments. The results in columns 5 and 6 of Table 3 are basically consistent with those in Table 1. The coefficient on AFQT is .0177 (.0096) and the coefficient on AFQT×t/10 is .1148 (.0164). These estimates imply that conditional on years of schooling, AFQT has only a small effect on initial wages, but when t is 10 a two standard deviation shift in AFQT is associated with a wage differential of .247. The coefficient on s×t/10 declines from -.0181 when the interactions are excluded in column 5 to -.0561 in column 6.

Controlling for Secular Change in the Return to Education

In column 9 of Tables 1, 2, and 3 we add the interaction between s and calendar time to the model containing father's education and AFQT.20 In the case of potential experience in Table 1, the education slope is reduced by .02 per year, and the interaction between education and experience/10 drops to -.051, but otherwise the results change little. In column 10 we add the interactions between calendar time and s, F_ED, and AFQT to the model containing the interactions between t and all three variables. In column 10 the interactions between F_ED and AFQT and calendar time have positive coefficients, indicating that the effects of these variables rose during the 1980s. Adding the time interactions reduces the size of the experience interactions with F_ED and AFQT, but the qualitative pattern of the results does not change.
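The instrumental-variables step used for Table 3, in which every term involving actual experience is instrumented by the corresponding term involving potential experience, can be sketched with a small two-stage least squares simulation. This is a hedged illustration, not the authors' procedure or data: the data-generating process, the coefficient values, and the function names are all assumptions made for the example, and standard errors are omitted.

```python
import numpy as np


def ols(x, y):
    """Least squares coefficients of y on the columns of x."""
    return np.linalg.lstsq(x, y, rcond=None)[0]


def two_stage_least_squares(x, z, y):
    """Project the regressors x on the instruments z, then regress y on the fit."""
    x_hat = z @ ols(z, x)   # first stage, applied to every column of x
    return ols(x_hat, y)    # second stage


def simulate(n=50_000, seed=0):
    """Hypothetical data: actual experience is endogenous, potential is not."""
    rng = np.random.default_rng(seed)
    s = rng.normal(size=n)                  # schooling (standardized)
    ability = rng.normal(size=n)            # unobserved by the econometrician
    pot = rng.uniform(0.0, 12.0, size=n)    # potential experience
    # Actual experience depends on ability, so it is correlated with the error.
    act = 0.7 * pot + 0.5 * ability + rng.normal(scale=0.5, size=n)
    w = (1.0 + 0.10 * s + 0.05 * act - 0.004 * s * act
         + 0.3 * ability + rng.normal(scale=0.2, size=n))

    const = np.ones(n)
    x = np.column_stack([const, s, act, s * act])   # terms in actual experience
    z = np.column_stack([const, s, pot, s * pot])   # matching potential terms
    return ols(x, w), two_stage_least_squares(x, z, w)


b_ols, b_iv = simulate()
print("OLS:", b_ols)
print("IV: ", b_iv)
```

In this hypothetical setup the OLS coefficient on actual experience is pushed upward by the omitted ability term, while the IV estimates should lie close to the values used to generate the data (0.05 for experience and -0.004 for the interaction).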
Controlling for Occupation

One objection to the theoretical framework underlying the estimates in Tables 1-3 is that it assumes that the flow of information to employers is independent of the type of job the worker begins in. This is contrary to the idea that some jobs are "dead end" jobs. Perhaps education (and high AFQT) enables a worker to gain access to jobs in which firms have the ability to observe whether the worker has higher level skills that are strongly related to productivity. As a simple check on this possibility, we present a series of equations in Table 4 that control for the 2-digit occupation of the first job. The results are very similar to what we obtain when occupation is excluded.21

20 Murphy and Welch (1992), Katz and Murphy (1992), Taber (1996) and Chay and Lee (1997) are among a large number of recent studies of changes in the structure of wages in the U.S. Since calendar time is positively correlated with experience t in a panel data set, the learning/statistical discrimination model implies that estimates of secular changes in the return to education and AFQT will be biased in opposite directions if one fails to add the interaction between these variables and t to the model.

21 An interesting project for future research would be to use information from the Dictionary of Occupational Titles on skill requirements of occupations and trace how easy to observe and hard to observe productivity characteristics are related to changes over a career in the skill requirements of the job a worker holds. It would also be interesting to examine how the slopes are influenced by the skill requirements of the initial occupation held by the individual.

The Effects of the Wage of a Sibling

In Table 5, we use the wages of siblings with 5 to 8 years of experience as a hard to observe background characteristic. The coefficient on s×t/10 is -.0097 (.0089) in column 4, which includes the log of the wage of the oldest sibling.
The learning model does not provide an explanation for the negative interaction term, nor does the conventional view of how education is related to on-thejob training. However, when we add the interaction between the sibling wage and t in column 5, the coefficient on the education interaction falls to -.0146, and the coefficient on the interaction between the sibling wage and t/10 is .086 (.0327) . 22 The effect of the sibling wage rises from .127 upon labor force entry to .213 after 10 years of experience—a very large increase. The point estimate of the interaction between education and experience result is essentially unchanged when we allow the effect of sibling wage. In Table 5, columns 5 and 6, we show that these results are robust to allowing the effects of education and the sibling wage to depend on calendar time. Our interpretation of these results begins with the premise that the labor market productivity of siblings are correlated. As a worker acquires experience this correlation is reflected in the performance record Dt and in wage rates. The sibling wage is correlated with education, and so the effect of education on the wage declines with experience because firms are estimating productivity with a bigger information set than at the time of labor force entry.23 The Experience Profile of the Effects of AFOT and Education on Wages In this section we take a more detailed look at how the effects of AFQT and s vary with experience by estimating models of the form wt = f(z,t;bz) + h(s,t;b,) + H(t) + eit where bz and b, are now vectors of parameters. Table 6 is based on models in which f(z,t;bz) and h(s,t;b,) are quartic polynomials in t. In the top panel, the experience measure is potential 22The corresponding point estimates are -.022 and .080 when we allow the effectsofeducation and the siblingwage to depend on calendartime. 
23 Farber and Gibbons (1996) use men and women, include Hispanics, and restrict their sample to persons who have worked at least three consecutive years since attending school. Using this sample the coefficients on AFQT×t and the effect on s×t of adding AFQT×t are similar to those reported above. We also obtain qualitatively similar results when we follow Farber and Gibbons and use the level of wages rather than the log. We experimented with an indicator for whether any person in the respondent's household had a library card at the time the respondent was 14, a variable which Farber and Gibbons also used. We confirm Farber and Gibbons' finding that the coefficient on the residual from a regression of this variable on the initial real wage, education, part-time status, an interaction between education and part-time status, race, sex, age, and calendar year increases with experience, as well as their finding that the results for library card and AFQT are weakened substantially when these variables are interacted with calendar time. However, when we use the library card variable itself the effect of the library card variable falls rather than rises with experience. We thank Henry Farber for assisting us in reconstructing the Farber and Gibbons sample.

experience; in the bottom panel we use actual experience instrumented by potential experience. All of the models in the tables contain the other control variables discussed above. They also include F_ED and F_ED×t. The columns report $\partial w_t/\partial \mathrm{AFQT}$, $\partial^2 w_t/\partial \mathrm{AFQT}\,\partial t$, $\partial w_t/\partial s$, and $\partial^2 w_t/\partial s\,\partial t$ at various experience levels. The first column of the table shows that $\partial w_t/\partial \mathrm{AFQT}$ increases steadily from .0197 when t is 0 to .121 when t is 12. (We only go out to t=12 because sample information becomes thin at higher values.) The specification that we use in most of the paper, in which $f(z,t;b_z)$ and $h(s,t;b_s)$ are linear in t (column 6 in Tables 1-3), suggests an increase in $\partial w_t/\partial \mathrm{AFQT}$ from .0179 to .116 as t goes from 0 to 12.
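As a rough illustration of how the level and slope columns of Table 6 relate to a polynomial specification, the sketch below evaluates the marginal effect of AFQT and its rate of change with experience. It assumes that f(z,t;b_z) takes the form AFQT times a polynomial in t, which is our reading of the text and not a statement of the exact specification. The function name `afqt_profile` is hypothetical. The only numbers used are the two endpoints quoted for the linear specification (.0179 at t=0 and .116 at t=12); no quartic coefficients are reported in the text, so none are supplied here.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def afqt_profile(b, t):
    """Marginal effect of AFQT and its rate of change at experience t.

    Assumes f(z, t; b) = AFQT * (b[0] + b[1]*t + ... + b[k]*t**k), so that
    dw/dAFQT is the polynomial itself and d2w/(dAFQT dt) is its derivative.
    """
    b = np.asarray(b, dtype=float)
    level = P.polyval(t, b)             # dw/dAFQT at t
    slope = P.polyval(t, P.polyder(b))  # d2w/(dAFQT dt) at t
    return float(level), float(slope)

# Linear special case implied by the endpoints quoted in the text:
# the effect is .0179 at t=0 and .116 at t=12, so the slope is constant.
b_linear = [0.0179, (0.116 - 0.0179) / 12]
level12, slope12 = afqt_profile(b_linear, 12.0)
print(level12, slope12)
```

Passing five coefficients instead of two gives the quartic case. In the linear case the second-derivative column is constant by construction, which is why the quartic specification is needed to see whether the rate of increase eventually declines.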
As noted earlier, employer learning implies that $\partial w_t/\partial \mathrm{AFQT}$ is nondecreasing in t (i.e., $\partial^2 w_t/\partial \mathrm{AFQT}\,\partial t \ge 0$), with a strict inequality likely if some new information arrives each period on y. If the noise in observations of $y_t$ is iid, then the rate of increase $\partial^2 w_t/\partial \mathrm{AFQT}\,\partial t$ should decline with t, as shown in expression (12c) for $\theta_t$ above. The rate of increase must decline eventually because the amount of additional information in observations of labor market performance is declining. ($\theta_t$ is bounded at 1.) However, it is possible that the first two or three observations on a worker are particularly noisy because of factors that we have left out of the model. For example, job specific or occupation specific match quality may be more variable for new workers than for more experienced ones. In column 2 we report $\partial^2 w_t/\partial \mathrm{AFQT}\,\partial t$ for various experience levels. The values increase from .0025 when t is 0 to .0104 when t is 5, remain at about this level until t is 8 (the maximum is .0108 at t = 6.5), and then decline to .0048 when t is 12. These results are reasonably consistent with a decline in the amount of new information with experience after an initial period of noisy observations.24

In panel B we replace potential experience with actual experience, and treat actual experience as endogenous. The 99th percentile value for this variable is only 13.33, so there is not much sample information on t beyond this point. In column 1 we see that the effect of AFQT increases with experience. The rate of increase $\partial^2 w_t/\partial \mathrm{AFQT}\,\partial t$ rises at first from .0092 when t = 0 to .0138 when t = 5, but declines to -.0012 when t = 12. However, the standard errors on these

24 We used two other non-linear specifications. The first used spline functions with break points at t=2, t=4, t=7, and t=10. In the second we restricted $f(z,t;b_z)$ so that $\partial^2 w_t/\partial \mathrm{AFQT}\,\partial t = 0$ when t is 25 and $h(s,t;b_s)$ so that $\partial^2 w_t/\partial s\,\partial t = 0$ when t is 25.
The idea is that the information about productivity that is contained in AFQT is fully revealed by the time t is 25. Both of these specifications yielded results similar to the reported model, in which $\partial^2 w_t/\partial \mathrm{AFQT}\,\partial t$ is flat or increasing and then definitively decreasing after about 7 years.

derivatives are quite large. These results are also loosely consistent with the proposition that the rate at which new information about initial productivity arrives declines with experience, but the estimates are not sufficiently precise to say much about this. As the NLSY sample ages, it will be interesting to revisit the issue. In the model with potential experience, the return to education increases slightly between t=0 and t=3, and then declines sharply. In the model with actual experience, the decline is constant throughout from .0881 at no experience to .0299 at 12 years of experience.

Testing the restrictions on the experience profiles of the effects of s and z on the wage.

It is interesting to see how well the experience profiles of the education and AFQT coefficients satisfy the restrictions in propositions 3 and 4. One complication in performing these tests is the place of race within our model: should we treat race as an s variable or a z variable? The answer to this question hinges on the extent to which employers violate the law and use race as an indicator of productivity. We discuss this at length in section 5 below. For now we will sidestep the issue by running separate tests on the white and black samples. Consider first a specification in which s and z are both scalars, education and AFQT score, respectively. Proposition 3 says that the product of -cov(s,z)/var(s) (the negative of the coefficient of the regression of z on s) and the coefficient on the interaction between AFQT and experience (z×t) should equal the coefficient on the interaction between education and experience (s×t). In the white sample, the product is -.00162 and the coefficient on s×t is -.00232.
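The restriction in proposition 3 can be sketched as a small calculation. The code below is a minimal illustration, not the test procedure used for the reported results: it computes the implied coefficient on s×t from the regression slope of z on s and the coefficient on z×t, and also handles several z variables by summing the products. The function name `implied_st_coefficient` is hypothetical. A formal Wald test would additionally need the covariance matrix of the estimates, which is not given in the text. The only empirical numbers used are the products and coefficients quoted in this section.

```python
import numpy as np

def implied_st_coefficient(s, z_columns, b_zt):
    """Implied coefficient on s*t: sum_k -[cov(s, z_k)/var(s)] * b_zt[k].

    s          : 1-D array of the easy-to-observe variable (e.g. education)
    z_columns  : 1-D array or (n, k) array of hard-to-observe variables
    b_zt       : scalar or length-k coefficients on the z*t interactions
    """
    s = np.asarray(s, dtype=float)
    Z = np.asarray(z_columns, dtype=float)
    if Z.ndim == 1:
        Z = Z[:, None]
    b_zt = np.atleast_1d(np.asarray(b_zt, dtype=float))
    s_dev = s - s.mean()
    # OLS slope of each z_k on s equals cov(s, z_k) / var(s)
    slopes = (s_dev @ (Z - Z.mean(axis=0))) / (s_dev @ s_dev)
    return float(-(slopes @ b_zt))

# Values quoted in the text: (implied product, estimated coefficient on s*t)
reported = {"white": (-0.00162, -0.00232), "black": (-0.00196, -0.00498)}
gaps = {group: est - implied for group, (implied, est) in reported.items()}
print(gaps)
```

The gap between the estimated and implied coefficients is what the test evaluates; whether a given gap is statistically distinguishable from zero depends on standard errors that this sketch does not model.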
A Wald test does not reject the proposition. In the black sample the corresponding numbers are -.00196 and -.00498 and the proposition is rejected.25 We might also want to test whether the entire profiles of the interactions between s and t and between z and t are in accordance with proposition 3. One way to do this is to estimate the model in which the interactions are specified as fourth-order polynomials and jointly test whether the coefficients on the four polynomial interactions are related by the coefficient of the regression of z on s. This seems a bit restrictive in that we only expect the relationship to hold over the range of observed data, and polynomials that have very different coefficients can be fairly similar over a short range. However, we performed these tests on models in which the interactions of AFQT and education with experience are modeled as fourth order polynomials. Once again, we fail to reject the proposition for whites but reject for blacks.

25 It should be noted that the standard errors for these tests do not account for possible heteroscedasticity in the data.

We also tested proposition 4, the vector analog of proposition 3, on models which include both AFQT and father’s education. We also considered as z variables the dummy variables indicating whether these quantities were known. This test amounts to a t-test of whether the sum of the products of -cov(s,z)/var(s) and the coefficient on z×t for each z variable is equal to the coefficient on s×t. For whites, the sum of the products equals -.00193, the coefficient on s×t is -.00254, and the proposition is not rejected. For blacks, we obtain -.00166 and -.00456 and reject the proposition.

Wage Growth Equations

In Table 7 we estimate (9) in first difference form. We restrict $b_{st}$ to be $b_{s0} + b_{s1}t$ and $b_{zt}$ to be $b_{z0} + b_{z1}t$. The usual reason for working in first differences is to eliminate correlation between the regressors and a fixed error component.
This motivation is not compelling in the present case. However, it is possible that the first difference specification may be less sensitive to errors in identifying when individuals start their careers. Columns 1-4 report OLS estimates with potential experience. The coefficient on s×Δt will pick up the effects of secular changes in the return to education as well as the changes with experience in the return to education. The upward secular trend in the return to schooling may partially explain the fact that s×Δt has a positive coefficient in the basic model in column 1 while it is negative for the corresponding level specification in Table 1, column 4.26 (A secular trend in the return to education or AFQT matters less when estimating the equations in levels because much of the variation in experience is across persons of different ages.) Also, the estimates are much less precise when we estimate in first difference form. However, the key results are qualitatively similar to the level specifications. In particular, the coefficient on s×Δt/10 declines from .0148 (.0094) in column 1 to -.0092 (.0110) when we add the AFQT and F_ED interaction terms in column 2. The size of the decline in this coefficient is very similar to the drop in the coefficient on s×t when we add AFQT×t and F_ED×t to the level specifications. (See columns 5 and 6 of Table 1.) The AFQT interaction term is positive with a t value of 3.4. The F_ED interaction is also positive and similar in magnitude to the result obtained in levels, but it is not statistically significant.

26 See Murphy and Welch (1991) and many subsequent studies. Murnane et al. (1995) provide evidence of an increase in the return to aptitude and achievement, as measured by tests.

Columns 5-8 report IV estimates of wage growth equations using actual experience as the experience measure. The coefficient on AFQT×Δt/10 is .0905 (.0197), which compares to the value of .1148 in Table 3, column 5.
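To see why the wage growth regressions use s×Δt and AFQT×Δt as regressors, it may help to write out the differencing step. The lines below are a sketch under the assumption that the level equation is linear in s and z with the restricted coefficients $b_{st} = b_{s0} + b_{s1}t$ and $b_{zt} = b_{z0} + b_{z1}t$; here $t'$ denotes experience at the previous observation and $\Delta t = t - t'$, notation introduced only for this illustration.

```latex
\begin{aligned}
w_t    &= (b_{s0} + b_{s1}\,t)\,s  + (b_{z0} + b_{z1}\,t)\,z  + H(t)  + \varepsilon_t \\
w_{t'} &= (b_{s0} + b_{s1}\,t')\,s + (b_{z0} + b_{z1}\,t')\,z + H(t') + \varepsilon_{t'} \\
w_t - w_{t'} &= b_{s1}\,s\,\Delta t + b_{z1}\,z\,\Delta t
              + \bigl[H(t) - H(t')\bigr] + (\varepsilon_t - \varepsilon_{t'})
\end{aligned}
```

The level terms $b_{s0}s$ and $b_{z0}z$ drop out, so only the experience slopes are identified. Because calendar time advances together with experience for each person, any secular trend in the return to s or z loads onto the same Δt interactions, which is the concern raised about the positive coefficient on s×Δt in the basic model.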
The coefficient on s×Δt/10 declines from .0295 (.0079) to -.0030 (.0100) when AFQT×Δt/10 and F_ED×Δt/10 are added.

5. Do Employers Statistically Discriminate on the Basis of Race?

Thus far we have focused the discussion on employers' use of education as an indicator of labor market productivity. In this section we examine the role of race. By almost any measure, young black men are disadvantaged relative to whites in the U.S. On average, black males have poorer, less educated parents, are more likely to grow up in a single parent household, live in more troubled neighborhoods, attend schools with fewer resources, and have fewer opportunities for teenage employment than white males. Many of these factors are correlated with educational attainment and labor market success. They are likely to lead to a black/white differential in the average skills of young workers. Discrimination in various forms may further hinder the development of human capital in black children, and add to a gap in skills that is due to the race difference in socioeconomic background. The gap in some indicators of skill is very large. In our sample, the mean percentile score on the AFQT for the black sample is 23.78 while the mean for whites is 53.27. Neal and Johnson (1996) and a number of earlier papers have shown that in the NLSY sample of men a substantial part of the race gap in wages is associated with the race gap in AFQT. If pre-market discrimination is an important factor in a gap between the average skills of black and white workers, then it seems likely that various forms of current labor market discrimination contribute to race differences in wages that are unrelated to skill. However, it is nevertheless interesting to examine the possibility that a correlation between race and skill might lead a rational, profit maximizing employer to use race as a cheap source of information about skills and statistically discriminate on the basis of race.
Such statistical discrimination along racial lines can have very negative social consequences and is against the law. However, such discrimination would be difficult to detect. A statistically discriminating firm might use race, along with education and other information, to predict the productivity of new workers. With time, the productivity of the worker would become apparent and compensation would be based on the larger information set available rather than the limited information available at the time of hire. Consequently, if statistical discrimination on the basis of race is important, then adding interactions between t and z variables such as AFQT and father's education to the wage equations should lead to a positive (or less negative) coefficient on Black×t and should lead to an increase in the race intercept. As noted in section 2, if firms use race as information then it behaves as an s variable in the model and the logic is the same as in our analysis of the effect of education. On the other hand, if firms do not use or only partially use race as information, then a race indicator behaves as a z variable. As discussed in section 2, in this case the race gap should widen with experience if race is negatively related to productivity, and adding a second z variable that is negatively related to race will reduce the race gap in experience slopes.27

The race differential in our basic specification in column 1 of Table 1 is -.1801. This drops to -.0969 when AFQT, F_ED, and education×t are added to the equation (column 5). When Black×t/10 is added in column 6, it enters with a coefficient of -.1456 (.0216). This coefficient is consistent with the hypothesis of no or very limited statistical discrimination on the basis of race and inconsistent with the hypothesis that firms make full use of race as information.
The coefficient on Black is insignificantly different from 0, although the models do not provide a clear prediction about the sign of this variable, since race may be correlated with information in q that can legally be used. The fact that the coefficient on Black×t/10 rises to -.0816 when F_ED×t and AFQT×t are added to the equation (column 8) is not informative about whether or not firms make full use of race as information.28 We obtain similar results using alternative experience measures in Tables 2 and 3. In Table 4, columns 7 and 8, we obtain similar results after controlling for initial 2-digit occupation. We obtain similar results using growth equations in Table 7, which should be robust to the presence of an economy wide time trend affecting the return to education, race, and AFQT. However, in the level equations we find that the results for race are sensitive to the treatment of economy wide time trends. When we use potential experience as the measure of t, the coefficient on Black×t declines only slightly (from -.0146 to -.0144) when we add time trend interactions involving race and AFQT to the wage level equation corresponding to Table 1, column 7, but the race-experience interaction no longer drops when the interaction between AFQT and experience is added. (Not reported.)

27 The learning model in section 2 implies that differences across groups in the association between s and the z variable will lead to group differences in the $b_{st}$ and $b_{zt}$ coefficients. We have not explored this empirically. An obstacle to doing so is that the results might be sensitive to the linearity assumptions that we have made.

28 Japanese and Chinese Americans score higher on aptitude and achievement tests than whites. Our analysis predicts that if firms statistically discriminate on the basis of race and ethnic background then the addition of AFQT and AFQT×t to an equation containing a dummy and experience interaction term for these groups will lead to an increase in the dummy variable and a reduction in the experience interaction.
Sample sizes do not permit an analysis of these groups. While one could differentiate among whites based on ethnicity (see Borjas (1992)), it is not clear that these ethnic differences are observable to employers. Our methods could be used to investigate statistical discrimination on the basis of attending prestigious colleges or particular college majors.

We wish to stress that the simple model of statistical discrimination cannot explain the negative coefficient on Black×t unless firms do not make full use of race as information. The accumulation of additional information during a career that can legally be used to differentiate among workers is fully consistent with our results. However, there are several other explanations of the race differences in the experience slope in the literature that may be at work here. It is also important to point out that the results for Black and Black×t alone (i.e., ignoring the behavior of the coefficients on education and education×t) are potentially consistent with a story in which firms are fully informed, AFQT is positively associated with on-the-job training, and the race difference in AFQT is partially responsible for a race differential in wage growth. Adding AFQT×t would reduce a negative bias in Black×t associated with differential training levels. The increase in Black×t when AFQT×t is added to the model would lead to a fall in the coefficient on Black. As we report below, we obtain qualitatively similar results when we add controls for employer training, but these controls reduce the magnitude of the coefficient on Black×t and the effect of adding AFQT×t on the coefficient on Black×t.

Another potential test of whether race is used to statistically discriminate or not is to see whether proposition 4 holds either when race is treated as an s variable or when it is treated as a z variable. To do this, we use the model in column 8 of Table 1.
With race treated as an s variable, we regress the z variables (AFQT, father’s education, and the dummies for not knowing these quantities) on the two s variables. We sum the products of these coefficients and the coefficients on the z×t interactions in the main regression and compare them to the coefficients on the s×t interactions. We can then conduct a joint test of whether these two quantities are equal. For the education interaction the sum of the products equals -.00183 while the model coefficient is -.00301. For the race interaction, the two terms have opposite signs; the sum is .00644 while the model coefficient is -.00816. Not surprisingly, the proposition is soundly rejected. When we treat race as a z variable, we begin our test by regressing the 5 z variables on education, our s variable. Here, we have only one restriction to test. The sum of the products equals -.00215 while the model coefficient equals -.00301. The proposition can be rejected at conventional levels of significance (the P-value is .027) but with corrected standard errors this will probably not be the case. This is a further indication that employers are not treating race as information, or at least not fully.

6. Models with Training

In Table 8 we report estimates of equation (24) along with models that exclude the training variables. In these models we have excluded father's education. In the basic model in column 1 the coefficient on s×t/10 is -.0102. In column 2 we add $T_t$ and $\Sigma T_t$ to the equation. The variable $T_t$ has the expected negative sign, with a coefficient of -.1044 (.0179), while $\Sigma T_t$ has a coefficient of .1864 (.0114). The coefficient on s×t/10 falls to -.0346. The coefficient on AFQT falls from .0828 to .0582 while the coefficient on education rises slightly. The substantial negative experience slope on education might be consistent with a human capital story in which knowledge obtained in school depreciates over time unless one receives training.
In column 3, AFQT×t/10 enters with a coefficient of .0502 (.0125), and the coefficient on s×t/10 drops from -.0358 to -.00427. These changes are consistent with employer learning/statistical discrimination. If we reverse the order in which the variables are added by adding AFQT×t before the training measures, the marginal effect of the training measures on education×t is much smaller. We have also estimated separate models for blacks and for whites and obtain a similar pattern. In columns 4-6 we investigate the effect of introducing the training measure on the race gap in wage slopes. The coefficient on Black×t/10 declines from -.1467 to -.1048 when we add the training measures. Adding AFQT×t/10 leads to a further decline to -.0777.

To reduce the difficulties associated with the lack of data on training in the early years of the study and individual heterogeneity that is correlated with both training and wages, we turn to a first differenced version of (24). In the first difference version the current and lagged values of $T_t$ enter. These results are in Table 9. The coefficient on education×t/10 declines from .0126 (.0094) to .0073 when the training measures are added. The coefficient on Black rises from -.0995 (.0351) to -.0923 (.0353). However, the coefficient on $T_t$ is positive while the coefficient on $T_{t-1}$ is negative. These signs are inconsistent with a simple human capital model but are consistent with an EL-SD model in which training opportunities are given to more productive workers and learning about productivity occurs over time. Adding the training variables to a model that contains AFQT and F_ED has little impact on the coefficients on these variables. (Compare columns 2 and 4.) Imprecision in the training measures may partially explain this fact, but does not provide an explanation for the sign pattern in the training coefficients.
The coefficients on s×t and Black×t decline in absolute value when AFQT and F_ED are added, as is predicted by the EL-SD model. Overall, the wage change results are quite consistent with an important role for EL-SD. We view the evidence as consistent with a role for both human capital and EL-SD, but cannot make a precise statement about the relative contribution of these factors because, as discussed above, training will be influenced by new information about employee performance and the quality of the training data is suspect.

7. Information Transmission Across Firms

The formal model that we have used to interpret the results assumes that employers have the same information about workers. The results suggest that information about productivity does eventually get reflected in wages. However, they do not identify whether these adjustments occur primarily in the current firm, presumably in response to outside pressure from competitors who have information about the worker, or through moves to other employers with associated wage increases for workers who do not move.29 In this section we briefly examine the issue of information transmission across firms. A number of theoretical papers discuss whether information about productivity will be reflected in promotion paths and wage increases within firms, as well as the strategies firms might use to try to hide information about good workers (e.g. Greenwald (1986), Waldman (1984), Lazear (1986), Gibbons and Katz (1991)).30 Unfortunately, the theory is ambiguous about whether a firm's private information concerning the worker will be reflected in wages offered by that firm to incumbent workers and about the mechanism that induces the firm to adjust wages. In some private information models in which only wages and perhaps position within the firm are observable to outside firms, the employer’s information is not reflected in wages until the worker gets an outside offer.
In Waldman (1984) it is reflected in wages after the firm reassigns the worker to a position in which output is more sensitive to ability. In Gibbons and Katz it is reflected in wages if the firm chooses not to lay off the worker. The firm lays off low productivity workers, who are hired by other firms at lower wages. Outside firms infer that the remaining workers are of higher quality, which forces the employer to raise wages of those who stay with the firm.

29 Although we do not know of systematic evidence on this, casual empiricism suggests that changes in the legal system have led some firms to adopt the explicit policy of not providing references for former employees. Also, increased firing costs and concern about litigation may have made employers more reluctant to discharge workers for poor performance. Statistical discrimination may become a more serious problem if information flows are restricted. This may lead firms to relate compensation to performance more explicitly, with more turnover being a "voluntary" response to below average wage increases. On the other hand, differences in wages across groups may be attenuated because firms may be reluctant to open up large wage differentials between persons with similar education, seniority, and experience. It is possible that the balance between these two considerations has changed over time.

30 None of this literature considers the implications of the possibility that employers and co-workers acquire reputations for how positive they are in promoting the careers of individuals or that the incentives of co-workers and even supervisors to keep favorable information about a colleague private or in concealing unfavorable information from associates outside the firm may be quite different from those of the employer. These factors would undermine the case that firms would want to and be able to keep inside information inside the firm.
Both models have the implication that hard to observe variables like AFQT, F_ED, and the wage of an older sibling should be positively related to wage growth if one does not condition on whether a person was laid off or not. This is what we found above. Gibbons and Katz (1991) provide empirical support for the hypothesis that layoffs should be negatively related to wage growth. But there are a number of other reasons why layoffs should be negatively related to wage growth (labor market conditions and lost seniority, for example). To obtain more focused tests, we interact personal characteristics that are hard for employers to observe directly with indicators for layoffs and discharges. The coefficients on these variables should differ from the coefficients on characteristics that affect productivity and are easy for employers to observe, such as years of schooling, if (1) layoffs occur for multiple reasons, some of which have nothing to do with the worker, (2) the probability that a layoff reflects low worker specific productivity relative to the wage is related to z variables, and (3) outside employers have information about the nature of the layoff or obtain information (through references, for example) about productivity. This suggests an equation of the form

$w_t - w_{t-1} = \beta_0 + \mathrm{Layoff}_t\,\beta_1 + z\beta_2 + z[\mathrm{Layoff}_t]\,\beta_3 + z[\mathrm{Layoff}_t]\,t\,\beta_4 + \text{other controls}.$

If knowledge acquired by firms is reflected in wages, then $\beta_2$ should be nonzero, and $\beta_3$ and $\beta_4$ should be near zero. If knowledge acquired by firms is not reflected in wages, then $\beta_2$ should be small and $\beta_3$ and $\beta_4$ should be nonzero. Given sample size limitations we have estimated a simplified version of the above equation on the sample of layoffs only, with $z[\mathrm{Layoff}_t]\,t$ excluded:

$w_t - w_{t-1} = (\beta_0 + \beta_1) + z\beta_3 + \text{other controls}.$

Our evidence on whether hard to observe variables are positively associated with layoff losses is weak at best. In fact, we find that losses are larger for persons with high AFQT.
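The full interaction equation above can be sketched as an ordinary least squares problem. The code below is only an illustration on synthetic, noiseless data: the variable names, the helper functions `layoff_design` and `fit_ols`, and every coefficient value are hypothetical, and the "other controls" are omitted. It shows which column of the design matrix each coefficient corresponds to, not how the reported estimates were produced.

```python
import numpy as np

def layoff_design(layoff, z, t):
    """Columns: constant, Layoff, z, z*Layoff, z*Layoff*t (other controls omitted)."""
    layoff = np.asarray(layoff, dtype=float)
    z = np.asarray(z, dtype=float)
    t = np.asarray(t, dtype=float)
    return np.column_stack([np.ones_like(z), layoff, z, z * layoff, z * layoff * t])

def fit_ols(X, y):
    """Least squares coefficients for y = X @ beta."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Synthetic data, for illustration only (not NLSY data)
rng = np.random.default_rng(0)
n = 500
layoff = rng.integers(0, 2, size=n).astype(float)  # 1 if laid off between t-1 and t
z = rng.normal(size=n)                              # standardized hard-to-observe trait
t = rng.uniform(0.0, 12.0, size=n)                  # experience

# Hypothetical coefficients beta_0 .. beta_4
beta_true = np.array([0.03, -0.10, 0.01, 0.02, -0.001])
X = layoff_design(layoff, z, t)
wage_growth = X @ beta_true                         # noiseless, so OLS recovers beta_true
beta_hat = fit_ols(X, wage_growth)
print(beta_hat)
```

In this setup the third element of `beta_hat` plays the role of the coefficient on z for all workers, while the fourth and fifth capture how the relationship between z and wage growth differs for laid-off workers and how that difference changes with experience.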
We have not controlled for labor market conditions, and among the sample of layoffs they may be correlated with AFQT.31

31 We investigated whether the finding that wage losses rise with AFQT is driven by a positive correlation between AFQT and employment in a white collar, non union job, where layoffs are least likely to be influenced by seniority rules. Gibbons and Katz note that layoffs are likely to be a particularly negative signal for white collar workers and restrict their analysis to them. However, splitting the sample leads to an even more negative coefficient for white collar workers than for blue collar workers.

In Table 10 we report estimates of the effect of AFQT and F_ED on employer initiated separations. These include layoffs, firings, and plant closings. Our results were not very sensitive to distinguishing among these three types of job loss. We find that AFQT has a weak negative effect on the probability of losing one’s job, even after conditioning on seniority in the firm. However, when seniority is controlled for, a swing of two standard deviations in AFQT changes this probability by .02, which is only 1/5 of the mean layoff rate of .1. We obtain similar results when the seniority control is dropped. Our results suggest that only a small part of the rise with t in the effect of AFQT on wages operates through an association between AFQT and layoffs and the wage losses experienced by those who are laid off.

Lazear (1986) presents a model in which both the current firm and outside firms observe indicators of the productivity of the worker. His model predicts that workers with favorable productivity traits that are hard to observe directly will be more likely to receive outside offers and more likely to quit than workers whose hard to observe characteristics make them less productive. In results not reported we find that F_ED is positively related to the quit rate conditional on education, experience, and tenure. AFQT does not have a significant effect.
Neither AFQT nor F_ED is significantly related to wage growth among those who quit. (Not reported.) These results tentatively suggest that information flows in the labor market are sufficient to force a firm to differentiate among workers as the firm obtains better information about their productivity. A careful investigation will require a separate paper.

8. The Potential for Testing Services to Certify Skill

Our estimates provide information about the rate at which employers learn about worker quality. In Altonji and Pierret (1996) we use our empirical estimates to explore the implications of the rate at which employers learn about worker quality for the empirical relevance of the educational screening hypothesis. We show that even if employers learn relatively slowly about the productivity of new workers, the portion of the return to education that could reflect signaling of ability is quite limited. While education may be too expensive to serve as a means for able workers to certify themselves to employers, perhaps other mechanisms could perform this function, at least for some determinants of productivity. Here we point out that interpreting our estimates of the time profile of the effect of AFQT on wages as the result of employer learning implies that high ability workers would have a substantial financial incentive to take the AFQT to differentiate themselves from those who are less able in this dimension. Suppose that a third party were to administer the AFQT and certify the results to outside employers, in much the same way that the Educational Testing Service administers the SAT exams.
Using our estimates of the learning profile and assuming that firms know all of the information contained in AFQT by the time experience is 15, we have computed how much a person who believes that he is 1 standard deviation above the mean for the AFQT would pay to take the test at the time he enters the workforce.32 The OLS estimates using potential experience (underlying Table 6, panel A, column 3) imply that if firms become fully informed about productivity by the time experience is 15 and the interest rate is .1, then the person would be willing to pay .559 of the first year's salary for the test.33 The corresponding value when we use potential experience as an instrument for actual experience (panel B, column 3) is .330.

These calculations raise the issue of why such a testing service has not emerged if information is initially imperfect. One answer is that firms are not aware that the AFQT captures characteristics that have a strong association with productivity. It is only recently, with the availability of the NLSY, that labor economists have become aware of this. Another is that it would be difficult for a testing firm to become established at a national level. A third is that, given race differences in the distribution of AFQT scores, firms who make use of AFQT information in hiring for a specific job would have the burden of establishing that the scores are relevant to productivity in that job or run the risk of violating discrimination laws. This would be true even if individuals provided firms with the test results. However, we do not find these answers to be fully satisfactory.34 Analyses based on variables such as the wage rates of siblings or father's education may be less vulnerable to this objection.
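The logic of the willingness-to-pay calculation described earlier in this section can be sketched as a discounted sum. The code below is a simplified illustration and is not expected to reproduce the .559 and .330 figures: it assumes that certification moves the worker immediately to the wage effect that would otherwise be reached at the full-information horizon, treats log wage differences as proportional differences, and ignores wage growth when expressing the result relative to first-year salary. The function name `willingness_to_pay` and the linear profile used in the example are assumptions; the paper's calculation is based on the Table 6 profiles, which are not available in the text.

```python
def willingness_to_pay(effect, horizon=15, r=0.1):
    """Discounted wage gain from revealing ability at labor market entry.

    effect  : function mapping experience t to the wage effect of being
              1 standard deviation above the AFQT mean at that experience
    horizon : experience level at which firms are assumed fully informed
    r       : annual interest rate used for discounting
    """
    full_information_effect = effect(horizon)
    return sum(
        (full_information_effect - effect(t)) / (1.0 + r) ** t
        for t in range(horizon)
    )

# Illustrative profile only: linear in t, using the endpoints quoted for the
# linear specification (.0179 at t=0 and .116 at t=12). Not the Table 6 profile.
def linear_profile(t):
    return 0.0179 + (0.116 - 0.0179) / 12 * t

value = willingness_to_pay(linear_profile, horizon=15, r=0.1)
print(value)
```

Because the gain in each year is the gap between the full-information effect and the effect the market has priced in so far, a faster learning profile shrinks the value of certification, while a later full-information horizon or a lower interest rate raises it.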
In any event, we should also point out that our estimates of the AFQT-experience profile are sensitive to treatment of time trends and training, so that the financial return to being certified as high AFQT is probably substantially less than the above numbers imply.

32 If a worker did not know his ability, he could take a practice test on his own. Presumably, this would not raise the total cost of the test very much.
33 Here we are assuming that only 1 worker takes the test and ignoring the fact that the composition of the pool of workers who choose to take the test in equilibrium would influence the return for a particular type of worker.
34 Note also that in the absence of an institution such as the Educational Testing Service, a firm might provide the test. Some firms perform their own testing. However, if the results were available to the employees or other firms know that a particular firm tests its employees, then the firm would not be able to capture the full return to testing.

9. Conclusion

This paper explores the implication of the premise that firms use the information they have available to them to form judgments about the productivity of workers and then revise these beliefs as additional information becomes available. This is a premise that seems natural to us and receives some strong empirical support in Farber and Gibbons (1996). If profit maximizing firms have limited information about the general productivity of new workers, then they may use easily observable characteristics such as years of education or race to statistically discriminate among workers. We show that as firms acquire more information about a worker, pay may become more dependent on productivity and less dependent on easily observable characteristics or credentials. This basic idea is quite general and provides a way to test for statistical discrimination in the labor market and elsewhere in situations in which agents learn, such as credit markets.
We investigate it empirically by estimating a wage equation that contains interactions between experience and hard to observe characteristics such as AFQT and father's education along with the interaction between experience and a variable that firms can easily observe, such as years of education. We assume that all three variables are related to productivity. We find that the wage effects of the hard to observe productivity variables rise with time in the labor market and the wage effect of education falls. These results match the predictions of our model of statistical discrimination with learning.

We use a similar methodology to investigate whether employers statistically discriminate on the basis of race. If our model is taken literally, the small race differentials for new workers and the spread in the race gap with experience are most consistent with the view that race is negatively correlated with productivity and the productivity gap becomes reflected in wages as firms acquire additional information that can legally be used to differentiate among workers. We wish to stress, however, that other factors are probably as or more important in differences between whites and blacks in wage profiles, and race differences in human capital accumulation account for at least part of our findings. Also, our empirical results for race are sensitive to treatment of economy-wide changes in the effects of race, AFQT, and education. Future research should also address the large race gap and education gap in employment rates, particularly for young workers. In situations in which there are alternatives to the conventional labor market and employees in the alternative sector do not acquire work histories that have value or are informative to firms in the conventional sector, then statistical discrimination of the type described above may reduce participation rates of the disadvantaged group in the conventional labor market.
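The interaction specification described above can be illustrated on synthetic data. What follows is only a minimal sketch, not the authors' code or data: the sample design, the coefficient values (0.08, 0.02, -0.03, 0.08), the noise level, and the helper `ols` are all assumptions chosen for illustration. It regresses a simulated log wage on schooling s, a hard to observe variable z, and their interactions with experience/10, and checks that the schooling interaction comes out negative and the z interaction positive.

```python
import random

def ols(X, y):
    """Return OLS coefficients by solving (X'X) b = X'y with Gauss-Jordan elimination."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        # Partial pivoting for numerical stability.
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

random.seed(0)
X, y = [], []
for _ in range(2000):                       # hypothetical workers
    s = random.gauss(0.0, 2.0)              # schooling, deviation from mean (easy to observe)
    z = 0.25 * s + random.gauss(0.0, 0.9)   # hard to observe variable, correlated with s
    for t in range(10):                     # years of experience 0..9
        x = t / 10.0
        log_wage = (2.0 + 0.08 * s + 0.02 * z
                    - 0.03 * s * x          # assumed: schooling effect falls with experience
                    + 0.08 * z * x          # assumed: effect of z rises with experience
                    + 0.30 * x + random.gauss(0.0, 0.2))
        X.append([1.0, s, z, s * x, z * x, x])
        y.append(log_wage)

b = ols(X, y)
b_s_x, b_z_x = b[3], b[4]                   # coefficients on s*experience/10 and z*experience/10
print(f"s x experience/10: {b_s_x:.4f}   z x experience/10: {b_z_x:.4f}")
```

The sketch treats repeated observations on a worker as independent, so it says nothing about the White/Huber standard errors used in the tables; it only shows how the sign pattern of the two interaction coefficients is read off the regression.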
It is worth emphasizing that the analysis in the paper suggests alternative interpretations of empirical models of wages and other outcomes that involve experience interactions. It will be useful to re-examine the results of other studies that included interactions between experience and easy to observe variables such as schooling, race, gender, and experience in equations that also contain interactions between experience and harder to observe background measures. We have not been successful in sorting out the relative importance of differences among workers in training on one hand and statistical discrimination with learning on the other for our results. This is an important area for future research.

An important and reasonably straightforward extension of the analysis is to other easily observable and hard to observe background characteristics. For example, do firms statistically discriminate on the basis of the neighborhood one is from or on the basis of the reputation of the high school, college, or graduate school one attends? A study of whether new immigrants are judged by the average skills of their countrymen in the U.S. would be a natural step in the research by Borjas (1992) and others documenting differences among immigrants in labor market success. These issues are researchable using the approach developed in this paper. Finally, it would be useful to apply the methods of the paper to other labor market outcomes in addition to wages.

References

Aigner, D. and G. Cain (1977), “Statistical Theories of Discrimination,” Industrial and Labor Relations Review.
Albrecht, J. (1981), “A Procedure for Testing the Signalling Hypothesis,” Journal of Public Economics, pp. 123-32.
Altonji, J. G. and C. R. Pierret (1997), “Employer Learning and the Signaling Value of Education,” in Ohashi, Isao and Tachibanaki, Toshiaki (eds.), Internal Labour Markets, Incentives and Employment. Macmillan Press Ltd.
Altonji, J.G. and J.
Spletzer (1991), “Worker Characteristics, Job Characteristics, and the Receipt of On-the-Job Training,” Industrial and Labor Relations Review 45(1), pp. 58-79.
Bartel, A. P., and N. Sicherman (1993), “Technological Change and On-The-Job Training of Young Workers,” unpublished paper, Columbia University.
Becker, G. S. (1971), The Economics of Discrimination. 2nd ed., Chicago: University of Chicago Press.
Borjas, G. (1992), “Ethnic Capital and Intergenerational Mobility,” Quarterly Journal of Economics 107(1), pp. 123-150.
Carmichael, H. L. (1989), “Self-Enforcing Contracts, Shirking, and Life Cycle Incentives,” The Journal of Economic Perspectives 3 (Fall): 65-84.
Chay, Kenneth Y. and David S. Lee (1997), “Changes in Relative Wages in the 1980s: Returns to Observed and Unobserved Skills and Black-White Wage Differentials,” unpublished paper, February.
Coate, S. and G. Loury (1993), “Will Affirmative Action Policies Eliminate Negative Stereotypes?” American Economic Review 83 (5), pp. 1220-1240.
Devine, T.J., and N.M. Kiefer (1991), Empirical Labor Economics. New York: Oxford University Press.
Farber, H., and R. Gibbons (1996), “Learning and Wage Dynamics,” Quarterly Journal of Economics, pp. 1007-47.
Frazis, H. (1993), “Selection Bias and the Degree Effect,” Journal of Human Resources 28 (3) (Summer): 538-554.
Gibbons, R., and L. Katz (1991), “Layoffs and Lemons,” Journal of Labor Economics 9: 351-80.
Greenwald, B. (1986), “Adverse Selection in the Labor Market,” Review of Economic Studies 53: 325-47.
Foster, A. D. and M. R. Rosenzweig (1993), “Information, Learning, and Wage Rates in Low Income Rural Areas,” Journal of Human Resources.
Holzer, H. (1988), “Search Methods Use by Unemployed Youth,” Journal of Labor Economics 6: 1-20.
Jovanovic, B. (1979), “Job Matching and the Theory of Turnover,” Journal of Political Economy 87: 972-90.
Katz, Lawrence F. and Kevin M. Murphy (1992), “Changes in Relative Wages, 1963-1987: Supply and Demand Factors,” Quarterly Journal of Economics.
Vol. 107:1, pp. 35-78.
Lang, K. (1986), “A Language Theory of Discrimination,” Quarterly Journal of Economics 101 (May): 363-82.
Lazear, E. (1986), “Raids and Offer Matching,” Research in Labor Economics 8 (part A): 141-165.
Lundberg, S., and R. Startz (1983), “Private Discrimination and Social Intervention in Competitive Labor Markets,” American Economic Review 73 (June): 340-347.
Lynch, L. (1992), “Private Sector Training and the Earnings of Young Workers,” American Economic Review 82 (March): 299-312.
Medoff, J. and K. Abraham (1980), “Experience, Performance and Earnings,” Quarterly Journal of Economics, December, pp. 703-736.
Montgomery, J. (1991), “Social Networks and Labor Market Outcomes: Toward an Economic Analysis,” American Economic Review 81: 1408-18.
Murnane, Richard J., John B. Willett and Frank Levy (1995), “The Growing Importance of Cognitive Skills in Wage Determination,” Review of Economics and Statistics, vol. LXXVII, no. 2 (May): 251-266.
Murphy, K. and F. Welch (1992), “The Structure of Wages,” Quarterly Journal of Economics 107 (February): 285-326.
Neal, Derek A. and William R. Johnson (1996), “The Role of Premarket Factors in Black-White Wage Differences,” Journal of Political Economy 104 (5): 869-895.
Oettinger, G. S. (1996), “Statistical Discrimination and the Early Career Evolution of the Black-White Wage Gap,” Journal of Labor Economics.
Parsons, D. O. (1986), “The Employment Relationship: Job Attachment, Worker Effort, and the Nature of Contracts,” in Ashenfelter and Layard, eds., Handbook of Labor Economics.
Parsons, D. O. (1993), “Reputational Bonding of Job Performance: The Wage Consequences of Being Fired,” unpublished paper, July.
Riley, J. G. (1979), “Testing the Educational Screening Hypothesis,” Journal of Political Economy, October, pp. S227-52.
Taber, C. R. (1996), “The Rising College Premium in the Eighties: Return to College or Return to Ability,” unpublished paper, Northwestern University (April).
Spletzer, J. and M.
Lowenstein (1996), “Belated Training: The Relationship Between Training, Tenure and Wages,” unpublished paper, Bureau of Labor Statistics.
Waldman, M. (1984), “Job Assignment, Signaling, and Efficiency,” Rand Journal of Economics 15: 255-67.
Weiss, A. (1995), “Human Capital and Sorting Models,” Journal of Economic Perspectives.
White, H. (1984), Asymptotic Theory for Econometricians. Orlando, FL: Academic Press.

Table 1: The Effects of Standardized AFQT, Father's Education, and Schooling on Wages
Dependent Variable: Log Wage. Experience Measure: Potential Experience.
OLS estimates (standard errors), by model (1)-(10)

(a) Education: (1) 0.0946 (0.0034); (2) 0.0742 (0.0039); (3) 0.0729 (0.0040); (4) 0.1001 (0.0051); (5) 0.0798 (0.0054); (6) 0.0984 (0.0057); (7) 0.0818 (0.0054); (8) 0.0949 (0.0057); (9) 0.0788 (0.0064); (10) 0.0855 (0.0064)
(b) Black: (1) -0.1801 (0.0117); (2) -0.1039 (0.0138); (3) -0.0974 (0.0141); (4) -0.1799 (0.0117); (5) -0.0969 (0.0141); (6) -0.0956 (0.0142); (7) 0.0153 (0.0203); (8) -0.0330 (0.0226); (9) -0.0948 (0.0142); (10) -0.0945 (0.0141)
(c) Standardized AFQT: (2) 0.0807 (0.0077); (3) 0.0783 (0.0078); (5) 0.0785 (0.0078); (6) 0.0179 (0.0107); (7) 0.0790 (0.0077); (8) 0.0328 (0.0115); (9) 0.0187 (0.0107); (10) -0.0028 (0.0111)
(d) Father's Education/10: (3) 0.0263 (0.0192); (5) 0.0259 (0.0192); (6) -0.0015 (0.0031); (7) 0.0028 (0.0019); (8) -0.0062 (0.0308); (9) -0.0163 (0.0308); (10) -0.0286 (0.0318)
(e) Education * Experience/10: (4) -0.0077 (0.0062); (5) -0.0098 (0.0061); (6) -0.0351 (0.0069); (7) -0.0122 (0.0061); (8) -0.0301 (0.0071); (9) -0.0510 (0.0087); (10) -0.0361 (0.0105)
(f) AFQT * Experience/10: (6) 0.0820 (0.0125); (8) 0.0622 (0.0143); (9) 0.0817 (0.0125); (10) 0.0316 (0.0241)
(g) Father's Ed * Experience/100: (6) 0.0592 (0.0372); (8) 0.0481 (0.0372); (9) 0.0611 (0.0371); (10) 0.0392 (0.0667)
(h) Black * Experience/10: (7) -0.1456 (0.0216); (8) -0.0816 (0.0262)

Note: All equations control for a quadratic time trend, urban residence, and dummy variables to control for whether Father’s education is missing and whether AFQT is missing, and interactions between these dummy variables and experience when Experience interactions are included.
Column 9 includes the interaction between education and time/10 (the estimate is .0349 (.0078)). Column 10 includes interactions of education (.0142 (.0101)), AFQT (.0688 (.0228)), and Father’s Education/10 (.0317 (.0631)) with time/10. Standard errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker. The sample size is 27704 observations from 4042 individuals.

Table 2: The Effects of Standardized AFQT, Father's Education, and Schooling on Wages
Dependent Variable: Log Wage. Experience Measure: Actual Experience.
OLS estimates (standard errors), by model (1)-(10)

(a) Education: (1) 0.0805 (0.0026); (2) 0.0613 (0.0031); (3) 0.0599 (0.0032); (4) 0.0713 (0.0033); (5) 0.0504 (0.0038); (6) 0.0628 (0.0040); (7) 0.0524 (0.0038); (8) 0.0614 (0.0040); (9) 0.0385 (0.0059); (10) 0.0485 (0.0064)
(b) Black: (1) -0.1378 (0.0113); (2) -0.0673 (0.0132); (3) -0.0624 (0.0134); (4) -0.1381 (0.0113); (5) -0.0625 (0.0134); (6) -0.0622 (0.0135); (7) -0.0025 (0.0152); (8) -0.0346 (0.0159); (9) -0.0608 (0.0135); (10) -0.0602 (0.0135)
(c) Standardized AFQT: (2) 0.0754 (0.0073); (3) 0.0730 (0.0075); (5) 0.0731 (0.0075); (6) 0.0366 (0.0082); (7) 0.0726 (0.0075); (8) 0.0430 (0.0084); (9) 0.0373 (0.0082); (10) 0.0041 (0.0120)
(d) Father's Education/10: (3) 0.0324 (0.0186); (5) 0.0321 (0.0187); (6) 0.0005 (0.0022); (7) 0.0032 (0.0019); (8) 0.0085 (0.0224); (9) 0.0042 (0.0223); (10) -0.0127 (0.0345)
(e) Education * Experience/10: (4) 0.0195 (0.0055); (5) 0.0200 (0.0054); (6) -0.0055 (0.0066); (7) 0.0163 (0.0054); (8) -0.0025 (0.0068); (9) -0.0320 (0.0099); (10) -0.0165 (0.0114)
(f) AFQT * Experience/10: (6) 0.0750 (0.0131); (8) 0.0614 (0.0148); (9) 0.0737 (0.0131); (10) 0.0226 (0.0240)
(g) Father's Ed * Experience/100: (6) 0.0587 (0.0367); (8) 0.0502 (0.0370); (9) 0.0613 (0.0365); (10) 0.0362 (0.0678)
(h) Black * Experience/10: (7) -0.1267 (0.0233); (8) -0.0583 (0.0280)

Note: All equations control for a quadratic time trend, urban residence, and dummy variables to control for whether Father's education is missing and whether AFQT is missing, and interactions between these dummy variables and experience when Experience interactions are included.
Column 9 includes the interaction between education and time/10 (the estimate is .0402 (.0085)). Column 10 includes interactions of education (.0195 (.0104)), AFQT (.0684 (.0211)), and Father’s Education/10 (.0333 (.0623)) with time/10. Standard errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker. The sample size is 27704 observations from 4042 individuals.

Table 3: IV Estimates of the Effects of Standardized AFQT, Father's Education, and Schooling on Wages
Dependent Variable: Log Wage. Experience Measure: Actual Experience with Potential Experience as Instruments.
IV estimates (standard errors), by model (1)-(10)

(a) Education: (1) 0.0813 (0.0028); (2) 0.0620 (0.0034); (3) 0.0606 (0.0035); (4) 0.0891 (0.0050); (5) 0.0692 (0.0054); (6) 0.0879 (0.0056); (7) 0.0726 (0.0054); (8) 0.0843 (0.0057); (9) 0.0468 (0.0065); (10) 0.0517 (0.0069)
(b) Black: (1) -0.1368 (0.0116); (2) -0.0650 (0.0132); (3) -0.0601 (0.0135); (4) -0.1367 (0.0116); (5) -0.0600 (0.0136); (6) -0.0593 (0.0136); (7) 0.0495 (0.0186); (8) 0.0054 (0.0205); (9) -0.0531 (0.0138); (10) -0.0527 (0.0138)
(c) Standardized AFQT: (2) 0.0762 (0.0074); (3) 0.0737 (0.0075); (5) 0.0738 (0.0076); (6) 0.0177 (0.0096); (7) 0.0728 (0.0075); (8) 0.0332 (0.0102); (9) 0.0218 (0.0096); (10) 0.0005 (0.0127)
(d) Father's Education/10: (3) 0.0337 (0.0188); (5) 0.0340 (0.0188); (6) 0.0000 (0.0028); (7) 0.0033 (0.0019); (8) 0.0091 (0.0282); (9) -0.0043 (0.0278); (10) 0.0111 (0.0363)
(e) Education * Experience/10: (4) -0.0165 (0.0088); (5) -0.0181 (0.0087); (6) -0.0561 (0.0100); (7) -0.0242 (0.0087); (8) -0.0483 (0.0101); (9) -0.1220 (0.0188); (10) -0.1090 (0.0221)
(f) AFQT * Experience/10: (6) 0.1148 (0.0164); (8) 0.0819 (0.0188); (9) 0.1056 (0.0163); (10) 0.0539 (0.0399)
(g) Father's Ed * Experience/100: (6) 0.0744 (0.0480); (8) 0.0531 (0.0484); (9) 0.0877 (0.0478); (10) 0.1219 (0.1124)
(h) Black * Experience/10: (7) -0.2305 (0.0318); (8) -0.1364 (0.0387)

Note: All equations control for a quadratic time trend, urban residence, and dummy variables to control for whether Father's education is missing and whether AFQT is missing, and interactions between these dummy variables and experience when Experience
interactions are included. The instrumental variables are the corresponding terms involving potential experience and the other variables in the model. Column 9 includes the interaction between education and time/10 (the estimate is 0.0803 (.0135)). Column 10 includes interactions of education (.0670 (.0166)), AFQT (.0546 (.0311)), and Father’s Education/10 (-.0376 (.0882)) with time/10. Standard errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker. The sample size is 27704 observations from 4042 individuals.

Table 4: Estimates of the Effects of Standardized AFQT, Father's Education, and Schooling on Wages Controlling for 2-digit Occupation Codes of Initial Job
Dependent Variable: Log Wage. Experience Measure: Potential Experience.
OLS estimates (standard errors), by model (1)-(10)

(a) Education: (1) 0.0759 (0.0042); (2) 0.0611 (0.0045); (3) 0.0596 (0.0046); (4) 0.0717 (0.0058); (5) 0.0572 (0.0061); (6) 0.0767 (0.0064); (7) 0.0592 (0.0061); (8) 0.0745 (0.0064); (9) 0.0666 (0.0067); (10) 0.0709 (0.0068)
(b) Black: (1) -0.1539 (0.0131); (2) -0.0917 (0.0154); (3) -0.0829 (0.0156); (4) -0.1539 (0.0131); (5) -0.0830 (0.0156); (6) -0.0812 (0.0156); (7) 0.0190 (0.0209); (8) -0.0413 (0.0222); (9) -0.0809 (0.0157); (10) -0.0809 (0.0156)
(c) Standardized AFQT: (2) 0.0662 (0.0085); (3) 0.0635 (0.0086); (5) 0.0634 (0.0086); (6) -0.0036 (0.0111); (7) 0.0638 (0.0086); (8) 0.0061 (0.0117); (9) -0.0030 (0.0111); (10) -0.0151 (0.0119)
(d) Father's Education/10: (3) 0.0298 (0.0206); (5) 0.0299 (0.0207); (6) -0.0049 (0.0315); (7) 0.0310 (0.0206); (8) 0.0010 (0.0316); (9) -0.0051 (0.0316); (10) -0.0188 (0.0342)
(e) Education * Experience/10: (4) 0.0057 (0.0066); (5) 0.0032 (0.0065); (6) -0.0245 (0.0075); (7) 0.0008 (0.0065); (8) -0.0212 (0.0078); (9) -0.0354 (0.0096); (10) -0.0254 (0.0119)
(f) AFQT * Experience/10: (6) 0.0940 (0.0140); (8) 0.0807 (0.0159); (9) 0.0940 (0.0140); (10) 0.0626 (0.0280)
(g) Father's Ed * Experience/100: (6) 0.0532 (0.0411); (8) 0.0447 (0.0414); (9) 0.0530 (0.0411); (10) 0.0229 (0.0752)
(h) Black * Experience/10: (7) -0.1377 (0.0233); (8) -0.0542 (0.0281)

Note: All equations control for a quadratic time trend, urban residence, and dummy
variables to control for whether Father's education is missing and whether AFQT is missing, and interactions between these dummy variables and experience when Experience interactions are included. Column 9 includes the interaction between education and time/10 (the estimate is .0206 (.0083)). Column 10 includes interactions of education (.0072 (.0111)), AFQT (.0409 (.0263)), and Father’s Education/10 (.0405 (.0697)) with time/10. Standard errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker. The sample size is 22271 observations from 3187 individuals.

Table 5: OLS Estimates of the Effects of Sibling Wage and Schooling on Wages
Dependent Variable: Log Wage. Experience Measure: Potential Experience.
OLS estimates (standard errors), by model (1)-(7)

(a) Education: (1) 0.0936 (0.0055); (2) 0.0830 (0.0055); (3) 0.1032 (0.0077); (4) 0.0900 (0.0077); (5) 0.0938 (0.0078); (6) 0.0803 (0.0089); (7) 0.0805 (0.0089)
(b) Black: (1) -0.1932 (0.0164); (2) -0.1620 (0.0163); (3) -0.1932 (0.0164); (4) -0.1621 (0.0163); (5) -0.1620 (0.0163); (6) -0.1619 (0.0163); (7) -0.1619 (0.0163)
(c) Log Wage of Oldest Non-Missing Sibling: (2) 0.1876 (0.0191); (4) 0.1873 (0.0191); (5) 0.1266 (0.0276); (6) 0.1264 (0.0276); (7) 0.1230 (0.0323)
(d) Sibling is Female: (2) 0.0205 (0.0155); (4) 0.0208 (0.0155); (5) 0.0211 (0.0155); (6) 0.0214 (0.0155); (7) 0.0213 (0.0155)
(e) Education * Experience/10: (3) -0.0133 (0.0091); (4) -0.0097 (0.0089); (5) -0.0146 (0.0090); (6) -0.0220 (0.0113); (7) -0.0216 (0.0120)
(f) Log of Sibling Wage * Experience/10: (5) 0.0860 (0.0327); (6) 0.0862 (0.0326); (7) 0.0802 (0.0679)

Note: All equations control for a quadratic time trend and urban residence. Column 6 includes the interaction between education and time/10 (the estimate is .0204 (.0113)). Column 7 includes interactions of education (.0199 (.0122)) and Log Sibling Wage (.0085 (.0667)) with time/10. Standard errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker. The sample size is 13,555 observations from 1881 individuals.
Table 6: The Effects of Standardized AFQT and Schooling on Wages Over Time. Derivatives at Selected Experience Levels
Dependent Variable: Log Wage

A) Potential Experience.

Years of Experience   ∂w_t/∂AFQT         ∂²w_t/∂AFQT∂t      ∂w_t/∂s            ∂²w_t/∂s∂t
0                     0.0197 (0.0235)    0.0025 (0.0139)    0.0786 (0.0092)    0.0053 (0.0040)
1                     0.0235 (0.0275)    0.0049 (0.0144)    0.0830 (0.0101)    0.0034 (0.0040)
3                     0.0370 (0.0347)    0.0084 (0.0155)    0.0865 (0.0116)    0.0002 (0.0042)
5                     0.0560 (0.0415)    0.0104 (0.0166)    0.0843 (0.0131)    -0.0023 (0.0043)
8                     0.0881 (0.0512)    0.0104 (0.0181)    0.0731 (0.0152)    -0.0048 (0.0046)
12                    0.1206 (0.0640)    0.0048 (0.0201)    0.0513 (0.0179)    -0.0056 (0.0048)

B) Actual Experience Instrumented with Potential Experience.

Years of Experience   ∂w_t/∂AFQT         ∂²w_t/∂AFQT∂t      ∂w_t/∂s            ∂²w_t/∂s∂t
0                     0.0183 (0.0205)    0.0092 (0.0231)    0.0881 (0.0105)    -0.0024 (0.0086)
1                     0.0278 (0.0316)    0.0099 (0.0251)    0.0843 (0.0137)    -0.0051 (0.0089)
3                     0.0496 (0.0496)    0.0120 (0.0288)    0.0711 (0.0190)    -0.0074 (0.0096)
5                     0.0755 (0.0658)    0.0138 (0.0323)    0.0566 (0.0236)    -0.0068 (0.0102)
8                     0.1172 (0.0893)    0.0131 (0.0373)    0.0406 (0.0300)    -0.0037 (0.0112)
12                    0.1475 (0.1206)    -0.0012 (0.0436)   0.0299 (0.0381)    -0.0032 (0.0123)

The equations contain the same variables as the equation in column (6) of Table 1, except that the interactions between education and experience and between AFQT and experience involve fourth-order polynomials in experience. In panel B, the instrumental variables are the corresponding terms involving potential experience and the other variables in the model.

Table 7: Estimates of the Effects of AFQT, Father's Education, and Schooling on Wage Growth
Dependent Variable: Δ log Wage.
Coefficient Estimates (standard errors). Columns (1)-(4): OLS, potential experience. Columns (5)-(8): IV, actual experience treated as endogenous.

Education * ΔExperience: (1) 0.0148 (0.0094); (2) -0.0092 (0.0110); (3) -0.0080 (0.0113); (4) 0.0126 (0.0094); (5) 0.0295 (0.0079); (6) -0.0030 (0.0100); (7) 0.0021 (0.0101); (8) 0.0256 (0.0080)
AFQT * ΔExperience: (2) 0.0646 (0.0192); (3) 0.0595 (0.0210); (6) 0.0905 (0.0197); (7) 0.0700 (0.0213)
Father's Education * ΔExperience: (2) 0.0809 (0.0557); (3) 0.0776 (0.0563); (6) 0.0952 (0.0561); (7) 0.0818 (0.0567)
Black * ΔExperience: (3) -0.0213 (0.0409); (4) -0.0995 (0.0351); (7) -0.0850 (0.0420); (8) -0.1768 (0.0372)
S.E.E.: (1) .29655; (2) .29650; (3) .29650; (4) .29653; (5) .29600; (6) .29589; (7) .29588; (8) .29590

Note: All equations control for the change in a quadratic time trend, change in urban residence, and dummy variables to control for whether father’s education is missing and whether AFQT is missing, and interactions between these dummy variables and the change in experience when change in experience interactions are included. The instrumental variables are the corresponding terms involving potential experience and the other variables in the model. Standard errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker. The sample size is 19393 observations from 3580 individuals.
Table 8: The Effects of Standardized AFQT, Schooling, and Training on Wages
Dependent Variable: Log Wage. Experience Measure: Potential Experience. Training Measure: Predicted before 88, Actual After.
OLS estimates (standard errors), by model (1)-(6)

(a) Education: (1) 0.0808 (0.0054); (2) 0.0856 (0.0055); (3) 0.0951 (0.0057); (4) 0.0830 (0.0054); (5) 0.0869 (0.0055); (6) 0.0921 (0.0058)
(b) Black: (1) -0.1008 (0.0142); (2) -0.0920 (0.0143); (3) -0.0916 (0.0143); (4) 0.0117 (0.0206); (5) -0.0131 (0.0203); (6) -0.0332 (0.0221)
(c) Standardized AFQT: (1) 0.0822 (0.0078); (2) 0.0572 (0.0079); (3) 0.0218 (0.0104); (4) 0.0828 (0.0078); (5) 0.0582 (0.0078); (6) 0.0376 (0.0114)
(e) Education * Experience/10: (1) -0.0102 (0.0062); (2) -0.0346 (0.0066); (3) -0.0472 (0.0073); (4) -0.0129 (0.0062); (5) -0.0358 (0.0066); (6) -0.0427 (0.0075)
(f) AFQT * Experience/10: (3) 0.0502 (0.0125); (6) 0.0288 (0.0149)
(g) Black * Experience/10: (4) -0.1467 (0.0221); (5) -0.1048 (0.0222); (6) -0.0777 (0.0266)
(h) Training, T_t: (2) -0.1044 (0.0179); (3) -0.0936 (0.0180); (5) -0.0974 (0.0179); (6) -0.0930 (0.0180)
(i) Cumulative Training, ΣT_τ: (2) 0.1864 (0.0114); (3) 0.1781 (0.0116); (5) 0.1810 (0.0114); (6) 0.1776 (0.0116)

Note: All equations control for a quadratic time trend, urban residence, and a cubic in potential experience. In this table, T_t and ΣT_τ are the predicted probability of training in year t if before 1987 and actual training if year t is after 1987. Predictions are based on a probit model containing: years of schooling, potential experience, Black, AFQTPCT, schooling times potential experience and potential experience squared, AFQT times potential experience and potential experience squared, and the product of AFQTPCT, schooling, and potential experience. Standard errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker. The sample size is 25115 observations from 3768 individuals.

Table 9: Estimates of the Effects of AFQT, Father's Education, and Schooling on Wage Growth with Controls for Training
Dependent Variable: Δ log Wage.
Experience Measure: Potential Experience.
Coefficient Estimates (standard errors), by model (1)-(4)

Education * ΔExperience/10: (1) 0.0126 (0.0094); (2) -0.0080 (0.0113); (3) 0.0073 (0.0096); (4) -0.0108 (0.0113)
AFQT * ΔExperience/10: (2) 0.0595 (0.0210); (4) 0.0533 (0.0211)
Father's Education * ΔExperience/10: (2) 0.0078 (0.0056); (4) 0.0075 (0.0056)
Black * ΔExperience/10: (1) -0.0995 (0.0351); (2) -0.0213 (0.0409); (3) -0.0923 (0.0353); (4) -0.0215 (0.0408)
Lagged Training, lagged T / 10: (3) -0.0109 (0.0950); (4) -0.0336 (0.0951)
Training, T / 10: (3) 0.2622 (0.0891); (4) 0.2446 (0.0894)
S.E.E.: (1) .29653; (2) .29650; (3) .29649; (4) .29647

Note: All equations control for the change in a quadratic time trend, change in urban residence, and dummy variables to control for whether father's education is missing and whether AFQT is missing, and interactions between these dummy variables and the change in experience when change in experience interactions are included. Standard errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker. The sample size is 19393 observations from 3580 individuals.

Table 10: The Effects of Potential Experience, Standardized AFQT, Father's Education, and Schooling on the Probability of Employer-Initiated Separation
Linear Probability Models. Dependent Variable: Employer-Initiated Separation.
OLS estimates (standard errors)

Model:                                      (1)                 (2)
(a) Potential Experience / 10               -0.0302 (0.0194)    -0.1646 (0.0540)
(b) Potential Experience Squared / 100      -0.0143 (0.0113)    0.0148 (0.0129)
(c) Tenure                                  -0.0141 (0.0006)    -0.0146 (0.0006)
(d) Education                               -0.0153 (0.0012)    -0.0206 (0.0027)
(e) Black                                   0.0272 (0.0057)     0.0265 (0.0057)
(f) Standardized AFQT                       -0.0108 (0.0029)    -0.0251 (0.0061)
(g) Father's Education / 100                0.0303 (0.0701)     0.0991 (0.1532)
(h) Education * Experience / 10                                 0.0083 (0.0033)
(i) AFQT * Experience / 10                                      0.0188 (0.0065)
(j) Father's Ed * Experience / 1000                             -0.0910 (0.1738)

Note: An Employer-Initiated Separation includes separations because of layoffs, firings, and plant closings.
All equations control for urban residence, and dummy variables to control for whether Father's education is missing and whether AFQT is missing, and interactions between these dummy variables and experience when Experience interactions are included. Standard errors are White/Huber standard errors computed accounting for the fact that there are multiple observations for each worker. The sample size is 27443 observations from 4034 individuals.

Table A1: Descriptive Statistics

Variable                                    Mean      Standard Deviation   Minimum   Maximum
Real Hourly Wage                            8.370     4.766                2.01      96.46
Log of Real Hourly Wage (w)                 2.005     0.474                0.7       4.57
Potential Experience (t)                    7.349     3.665                0         21
Actual Experience (t)                       4.925     3.424                0         18.26
Education (s)                               12.699    2.136                8         18
Black dummy (Black)                         0.290     0.454                0         1
Dummy for not knowing AFQT Score            0.038     0.191                0         1
Standardized AFQT Score (AFQT)              -0.133    1.022                -2.780    1.922
Dummy for not knowing Father's Education    0.119     0.324                0         1
Father's Education (F_ED)                   11.709    3.112                4         20
Dummy for Urban Dweller                     0.781     0.413                0         1
Year                                        86.623    81.558               79        92
Training (T_t)                              0.096     0.200                0         1
Cumulative Training (ΣT_τ)                  0.462     0.549                0         5.592

Sample size = 27,704 observations except for the training measures where it is 25,115 observations.

Table A2: Relationships Among Wages, Schooling, AFQT, and Parental Education
Simple Regression Coefficients (standard error) and [Correlation coefficient]
Each row is a right hand side variable; entries are listed by dependent variable.

Highest Grade: Log Wage 0.0785 (0.0014) [0.3615]; Father’s Education 0.6197 (0.0098) [0.4029]; Standardized AFQT 0.2747 (0.0027) [0.5829]; Weeks of Company Training 0.1189 (0.0163) [0.0514]; Layoff -0.0193 (0.0010) [-0.1259]; Quit -0.0128 (0.0014) [-0.0589]; Actual Experience -0.0823 (0.0103) [-0.0329]; Potential Experience -0.4831 (0.0106) [-0.2923]
Father’s Education: Log Wage 0.0298 (0.0010) [0.2092]; Highest Grade 0.2592 (0.0041) [0.4029]; Standardized AFQT 0.1341 (0.0019) [0.4362]; Weeks of Company Training 0.0621 (0.0106) [0.0392]; Layoff -0.0059 (0.0007) [-0.0542]; Quit 0.0014 (0.0009) [0.0112]; Actual Experience -0.0323 (0.0067) [-0.0331]; Potential Experience -0.1660 (0.0071) [-0.1538]
Standardized AFQT: Log Wage 0.1565 (0.0031) [0.3567]; Highest Grade 1.2245 (0.0119) [0.5829]; Father’s Education 1.4280 (0.0204) [0.4362]; Weeks of Company Training 0.3072 (0.0345) [0.0645]; Layoff -0.0377 (0.0021) [-0.1174]; Quit -0.0121 (0.0029) [-0.0306]; Actual Experience -0.1036 (0.0218) [-0.0142]; Potential Experience -0.8138 (0.0227) [-0.2329]
Weeks of Company Training: Log Wage 0.0045 (0.0007) [0.0429]; Highest Grade 0.0214 (0.0029) [0.0514]; Father’s Education 0.0268 (0.0046) [0.0392]; Standardized AFQT 0.0124 (0.0014) [0.0645]; Layoff -0.0011 (0.0004) [-0.0130]; Quit -0.0017 (0.0006) [-0.0190]; Actual Experience -0.0173 (0.0044) [-0.0282]; Potential Experience -0.0297 (0.0047) [-0.0453]
Layoff: Log Wage -0.1659 (0.0104) [-0.1094]; Highest Grade -0.8921 (0.0468) [-0.1259]; Father’s Education -0.6558 (0.0728) [-0.0542]; Standardized AFQT -0.3904 (0.0222) [-0.1174]; Weeks of Company Training -0.2702 (0.1112) [-0.0130]; Quit -0.2707 (0.0092) [-0.2080]; Actual Experience -1.1766 (0.0696) [-0.1391]; Potential Experience -0.3223 (0.0753) [-0.0629]
Quit: Log Wage -0.2145 (0.0076) [-0.1909]; Highest Grade -0.3232 (0.0348) [-0.0589]; Father’s Education 0.0814 (0.0539) [0.0112]; Standardized AFQT -0.0683 (0.0165) [-0.0306]; Weeks of Company Training -0.2321 (0.0821) [-0.0190]; Layoff -0.1478 (0.0050) [-0.2080]; Actual Experience -1.2747 (0.0510) [-0.1658]; Potential Experience -0.8834 (0.0553) [-0.1070]
Actual Experience: Log Wage 0.0444 (0.0010) [0.2893]; Highest Grade -0.0374 (0.0047) [-0.0329]; Father’s Education -0.0350 (0.0072) [-0.0331]; Standardized AFQT -0.0106 (0.0022) [-0.0142]; Weeks of Company Training -0.0436 (0.0110) [-0.0282]; Layoff -0.0116 (0.0007) [-0.1391]; Quit -0.0230 (0.0009) [-0.1658]; Potential Experience 0.8605 (0.0045) [0.7953]
Potential Experience: Log Wage 0.0174 (0.0010) [0.1044]; Highest Grade -0.1899 (0.0042) [-0.2923]; Father’s Education -0.1561 (0.0066) [-0.1538]; Standardized AFQT -0.0718 (0.0020) [-0.2329]; Weeks of Company Training -0.0647 (0.0103) [-0.0453]; Layoff -0.0027 (0.0006) [-0.0629]; Quit -0.0138 (0.0009) [-0.1070]; Actual Experience 0.7452 (0.0039) [0.7953]

Appendix 1

From equation (11) we have

$$\Phi_s = -\mathrm{cov}(s,z)\,\frac{\mathrm{var}(v)+\mathrm{cov}(v,e)}{|\mathrm{var}(s,z)|} \quad\text{and}\quad \Phi_z = \mathrm{var}(s)\,\frac{\mathrm{var}(v)+\mathrm{cov}(v,e)}{|\mathrm{var}(s,z)|}.$$

We know that $\Phi_{zs} = \mathrm{cov}(s,z)/\mathrm{var}(s)$. This gives us the desired result:

$$\Phi_s = -\Phi_{zs}\Phi_z.$$

Appendix 2: Derivation of Equations (16) and (17)

Consider equation (15). Rewriting $\mathrm{var}(s,z)^{-1}$ as a partitioned matrix leads to

$$\mathrm{var}(s,z)^{-1} = \begin{bmatrix} \mathrm{var}(s) & \mathrm{cov}(s,z) \\ \mathrm{cov}(z,s) & \mathrm{var}(z) \end{bmatrix}^{-1},$$

where $\mathrm{var}(s,z)$ is the $(K+1)\times(K+1)$ variance matrix. Using the partitioned inverse formula and ignoring the first column (since it will be multiplied by 0), we have:

$$E\begin{bmatrix} b_{st} - b_{s0} \\ b_{zt} - b_{z0} \end{bmatrix} = \begin{bmatrix} -\mathrm{cov}(s,z)\,G \\ \mathrm{var}(s)\,G \end{bmatrix} \left[\mathrm{cov}(z, E(\Lambda v + e \mid D_t))\right] \qquad (15a)$$

where $G = [\mathrm{var}(s)\,\mathrm{var}(z) - \mathrm{cov}(z,s)\,\mathrm{cov}(s,z)]^{-1}$.

Now, consider the diagonal matrix $K$ which has elements of $\mathrm{cov}(z, \Lambda v + e)$ along the diagonal. $K^{-1}$ is also diagonal. Thus (15a) may be rewritten as:

$$E\begin{bmatrix} b_{st} - b_{s0} \\ b_{zt} - b_{z0} \end{bmatrix} = \begin{bmatrix} -\mathrm{cov}(s,z)\,G \\ \mathrm{var}(s)\,G \end{bmatrix} K\,K^{-1} \left[\mathrm{cov}(z, E(\Lambda v + e \mid D_t))\right] \qquad (15b)$$

Manipulating this further gives us:

$$E\begin{bmatrix} b_{st} - b_{s0} \\ b_{zt} - b_{z0} \end{bmatrix} = \begin{bmatrix} -\mathrm{cov}(s,z)\,G \\ \mathrm{var}(s)\,G \end{bmatrix} \theta_t \left[\mathrm{cov}(z, \Lambda v + e)\right] \qquad (15c)$$

where $\theta_t$ is a diagonal matrix with element $kk$ equal to

$$\theta_t^k = \frac{\mathrm{cov}(z_k, E(\Lambda v + e \mid D_t))}{\mathrm{cov}(z_k, \Lambda v + e)}.$$

Using the definition of $\Phi$ and evaluating the above equation leads to (16) and (17) in the text.

Appendix 3

The regression parameters are [equation illegible in the extracted text], where $\Phi^*_{z_1}$ is the coefficient on $z_1$ in the regression of $\Lambda v + e$ on $s$ and $z_1$, and $\Phi_{z_1}$ is the coefficient on $z_1$ in the regression of $\Lambda v + e$ on $s$, $z_1$, and $z_2$. By the omitted variables formula, we know that

$$\Phi^*_{z_1} = \Phi_{z_1} + \Phi_{z_2}\Phi_{z_2 z_1},$$

where $\Phi_{z_2}$ is the coefficient on $z_2$ in the regression of $\Lambda v + e$ on $s$, $z_1$, and $z_2$ and $\Phi_{z_2 z_1}$ is the coefficient on $z_1$ in the regression of $z_2$ on $z_1$ and $s$. Therefore, [equations illegible in the extracted text]. Taking the difference establishes that [equation illegible in the extracted text], where $\Phi_{z_2}$ is the coefficient on $z_2$ in the regression of $\Lambda v + e$ on $s$, $z_1$, and $z_2$ and $\Phi_{z_2 z_1}$ is the coefficient on $z_1$ in the regression of $z_2$ on $z_1$ and $s$.
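The omitted variables formula invoked in Appendix 3 is an exact algebraic property of least squares, so it can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the paper: the data are simulated, the variable `u` merely stands in for the composite term that is regressed on s, z1, and z2, all coefficient values are arbitrary, and the helper `ols` is hypothetical. It confirms that the coefficient on z1 in the short regression equals the coefficient on z1 in the long regression plus the coefficient on z2 times the coefficient on z1 from the auxiliary regression of z2 on s and z1.

```python
import random

def ols(X, y):
    """Return OLS coefficients by solving (X'X) b = X'y with Gauss-Jordan elimination."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        # Partial pivoting for numerical stability.
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

random.seed(1)
n = 500
s = [random.gauss(0.0, 1.0) for _ in range(n)]
z1 = [0.5 * a + random.gauss(0.0, 1.0) for a in s]
z2 = [0.3 * a + 0.4 * b + random.gauss(0.0, 1.0) for a, b in zip(s, z1)]
# u stands in for the composite term being regressed; its coefficients are arbitrary.
u = [-0.2 * a + 0.6 * b + 0.5 * c + random.gauss(0.0, 1.0)
     for a, b, c in zip(s, z1, z2)]

short = ols([[1.0, a, b] for a, b in zip(s, z1)], u)            # u on s, z1
full = ols([[1.0, a, b, c] for a, b, c in zip(s, z1, z2)], u)   # u on s, z1, z2
aux = ols([[1.0, a, b] for a, b in zip(s, z1)], z2)             # z2 on s, z1

phi_star_z1 = short[2]              # coefficient on z1, z2 omitted
phi_z1, phi_z2 = full[2], full[3]   # coefficients on z1 and z2, z2 included
phi_z2z1 = aux[2]                   # coefficient on z1 in the auxiliary regression
gap = phi_star_z1 - (phi_z1 + phi_z2 * phi_z2z1)
print(f"short: {phi_star_z1:.6f}  long + omitted term: {phi_z1 + phi_z2 * phi_z2z1:.6f}")
```

The identity holds in any sample as long as the three regressions share the same observations and the same included regressors (here a constant, s, and z1), which is why the check compares the two sides up to floating-point error rather than up to sampling error.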