Working Paper Series
A Road Map for Efficiently Taxing Heterogeneous Agents
WP 13-13R
This paper can be downloaded without charge from: http://www.richmondfed.org/publications/

Marios Karabarbounis
Federal Reserve Bank of Richmond∗
July 23, 2014

Abstract

This paper evaluates the quantitative potential of a tax system that depends on a rich set of household characteristics, such as the person's age, his/her financial assets, and the number of working members in his/her household. The justification for this kind of reform is that workers respond differently to wage changes depending on how close they are to retirement, how wealthy they are, and whether they are the main financial provider in the family. Using a life-cycle model with heterogeneous, two-member households, I find that it is optimal to decrease tax rates on younger and older workers, wealthier households that are closer to retirement, and two-earner households. The government can raise revenues by targeting workers with a low value of labor supply elasticity, such as middle-aged workers living in a single-earner family. This new system generates large gains: Total supply of labor increases by 3.17%, the capital stock by 8.37%, and consumption by 4.88%.

JEL Codes: E2; H21; H31.
Keywords: Heterogeneous Agents; Labor Supply Elasticity; Life Cycle; Optimal Taxation.

∗ Contact information: Federal Reserve Bank of Richmond, Research Department, 701 Byrd St., Richmond, VA, 23219; email: marios.karabarbounis@rich.frb.org. I would like to thank Yongsung Chang and Jay Hong for their continuous advice during this project. I would also like to thank Yan Bai, Rudi Bachmann, Mark Bils, Nezih Guner, Ellen McGrattan, Jose-Victor Rios-Rull, Juan M.
Sanchez, Gustavo Ventura, and seminar participants at ASU, Universitat Autonoma de Barcelona, Ecole Polytechnique, Federal Reserve Bank of Minneapolis, Federal Reserve Bank of Richmond, Federal Reserve Bank of St. Louis, SED Cyprus, and Vanderbilt. Earlier versions of this paper circulated under the title "Heterogeneity in Labor Supply Elasticity and Optimal Taxation." The views expressed here are those of the author and do not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System. All errors are my own.

1 Introduction

This paper evaluates the quantitative potential of a tax system that depends jointly on a rich set of household characteristics. In particular, the government can use information not only on the household's earnings, but also on the age of its members, their accumulated assets, and whether there is one working member or two working members. The justification for this kind of reform is that workers respond differently to wage changes depending on how close they are to retirement, how wealthy they are, and whether they are the main financial provider in the family. For example, a person closer to retirement is more likely to quit her job if her wage falls. This is also true if the person is part of a household with a large number of financial assets. In addition, the likelihood that this person will leave her job is even larger if she is not the only financial provider in the family. By decreasing tax distortions on workers who are very sensitive to wage changes, the government can minimize the efficiency loss of taxation and increase the size of the economic pie.

To study the potential of such a reform I build an incomplete markets, heterogeneous-agent model.
Heterogeneity in the model is introduced through i) a life-cycle dimension, ii) permanent and temporary uninsurable labor productivity shocks, and iii) two-member households whose members make joint decisions about how much the household will save and who (the male or both male and female) will join the workforce.1 A household member will be part of the labor force if his/her reservation wage is lower than the wage offered in the market. A small increase in the market wage will affect only those members whose reservation wage is sufficiently close to the market wage, the marginal workers. Hence, heterogeneity in labor supply elasticity arises endogenously in the model from differences in reservation wages. These results stem from the important insights of Hansen (1985), Rogerson (1988), and especially Chang and Kim (2006). To discipline the model I use empirical evidence from the Panel Study of Income Dynamics (PSID). The model replicates very closely a wealth of labor market statistics such as the fraction of people working (employment rates) as well as the fraction of people moving between employment and unemployment (transition rates), for both the primary and the secondary earner. To go a step further, I undertake a novel approach by comparing the model’s estimates for reservation wages to self-reported reservation wages from the Survey of Income and Program Participation (SIPP). I find that the model replicates quite well the relationship between reservation wages and asset holdings as well as time horizon, documented in the SIPP. Matching the behavior of employment rates, transition rates, and reservation wages is important, since these statistics determine the value of the labor supply elasticity. So what does the optimal tax system look like? 
The revenue-neutral tax reform favors four groups of taxpayers: very young households (ages 21-30), older households (ages 51-65), wealthier households closer to retirement, and dual-earner households. The new tax system raises revenues by targeting mainly middle-aged households (ages 31-50) with a single earner. However, the new system expands the economic pie to such an extent that a large part of the reform is self-financed from the newly employed workers.

Older households have a larger stock of savings and fewer working years ahead of them. Hence, they are relatively sensitive to changes in their after-tax earnings. Their (Frisch) elasticity of labor supply is around 2.7, much larger than the average of 1.4. To encourage older households to delay retirement, the new tax code decreases their rates by around 5%. In contrast, middle-aged households have to pay on average 3% more of their income. At first glance, this feature seems to distort the working choice of relatively productive agents. However, the government can raise revenues at a small efficiency cost, since this group has a small labor supply elasticity (on average 1.0). Younger households also receive generous tax cuts, since at the start of their careers they receive relatively lower wages.

Moreover, the new tax code decreases tax rates for households closer to retirement with a large amount of accumulated assets. For example, a household close to retirement with $40,000 in assets pays around 19% in taxes, while the same household with $100,000 pays just 16%. This way, the system encourages wealthier households to delay retirement and middle-aged households to build up savings in order to receive a tax cut later.

1 This framework is related to heterogeneous-agent life-cycle models of labor supply with a single earner (Rogerson and Wallenius, 2009; and Erosa, Fuster, and Kambourov, 2013) or two earners (Guner, Kaygusuz, and Ventura, 2012a).
Due to perfect risk-sharing within the household, the secondary earner is always less attached to her job than the primary earner. Males have an (extensive margin) labor supply elasticity of 0.9 and females of 1.8. To encourage female labor force participation, the new tax system decreases tax rates on dual-earner households. For example, average tax rates decrease for a two-earner 30-year-old household with median earnings by 4.8%, while they increase for a single-earner household with the same income by 4.7%. Given the new incentives, most of the single-earner households switch to a two-earner household, while only a small fraction switch to unemployment. These effects reflect the large labor supply elasticities for the secondary earner and the relatively lower elasticities for the primary earner in the household.

The gains associated with the reform turn out to be large. Compared to the current U.S. economy, total supply of labor, measured in efficiency units, increases by 3.17%. Capital and consumption increase even more, by 8.37% and 4.88%, respectively. Although the new economy involves people spending on average 10% more of their time at work, the large increase in consumption leads to a sizable increase in welfare of 0.90%. Key to the welfare gains is the way the tax-tags (age, assets, and household status) interact within the optimal tax code. For example, although asset holdings alone cannot deliver substantial welfare gains, they can promote welfare significantly if they are part of a system that also uses information on age and household status.

As a last exercise, I consider different versions of the model that incorporate i) a constant elasticity of labor supply, and ii) endogenous human capital accumulation. For both exercises, I calculate how much we can gain by changing the tax code to the optimal tax system found in our benchmark model.
The exercise highlights the crucial role of heterogeneity in labor supply elasticity in generating welfare gains. In contrast, I find that incorporating endogenous human capital into the model adds little to the welfare gains.

The main contribution of this paper is to provide explicit guidelines on how to efficiently tax heterogeneous agents. To my knowledge, this is the first paper that evaluates jointly the quantitative potential of age-dependent, wealth-dependent, and household-dependent policies using a model in which the individual labor supply elasticity depends endogenously on a rich set of household characteristics. Moreover, given our rich set of tax instruments, we can draw comparisons between our findings and several papers analyzing the shape of the optimal tax policy. For example, Weinzierl (2011) and Farhi and Werning (2013) find an increasing labor wedge to be optimal, i.e., to decrease distortions on younger and increase tax rates on older workers. I also find tax cuts to younger people to be optimal. However, unlike these papers, I find it optimal to decrease distortions for households closer to retirement. The government should raise revenues by targeting households strongly attached to their jobs, i.e., middle-aged households. Older households have a relatively larger amount of assets and have the option to retire early if their taxes increase. In this sense, this paper is closer to the findings of Conesa, Kitao, and Krueger (2009), who argue in favor of high capital taxation to implicitly tax very elastic old workers less.2

It is also of interest to compare our model with the recent findings of the dynamic optimal taxation literature. In particular, Kocherlakota (2005), Albanesi and Sleet (2006), and Kitao (2010) find it optimal to decrease capital income taxes for people reporting high labor earnings. This way, the government discourages people from oversaving while young and misreporting their true type when old.
In my paper, the government decreases labor income taxes on wealthier households (but only if they are close to retirement). While this policy also encourages the labor supply of older workers, it does so without distorting the savings choice of the young. Actually, young and middle-aged workers will save more in anticipation of tax cuts closer to retirement.

The paper also contributes to the discussion on the optimal tax treatment of families. The current U.S. tax code discourages secondary earners from joining the workforce, as additional family earnings are taxed at a relatively higher marginal rate. With this in mind, Guner, Kaygusuz, and Ventura (2012b) quantitatively evaluate the effects of a gender-based policy in which married females face a lower tax rate at the expense of married (and sometimes single) males. Their main finding is that a gender-based tax cannot do better than a gender-neutral proportional tax. While this paper also considers ways to encourage female labor force participation, it does so without resorting to explicit gender-based policies. In particular, I argue for tax cuts to both members of two-income households. In a simple comparison, I show that the two policies have different implications. A policy tagging a household's filing status (single- vs. dual-earners) instead of gender delivers much larger efficiency and welfare gains.

2 Their intuition is based on Erosa and Gervais (2002), who make an argument for tax rates that should follow the life-cycle labor supply profile. Both Erosa and Gervais (2002) and Conesa, Kitao, and Krueger (2009) choose a utility specification that allows the labor supply elasticity to vary inversely with working hours. In contrast, in my model, endogeneity in labor supply elasticity arises naturally through the presence of an extensive margin of labor supply and uninsurable idiosyncratic labor income shocks.
So although quantitative in nature, this paper brings forward several qualitative insights regarding the optimal tax-tagging policy by highlighting i) the importance of heterogeneity in the elasticity of labor supply across households, ii) the interaction between multiple tags in the design of the optimal policy, and iii) the potential of tagging a household's filing status compared to other family-related policy tags such as gender.

This paper is organized as follows. Section 2 constructs a simple example to develop intuition regarding the main results of the paper. Section 3 sets up the model. Section 4 describes the quantitative specification of the model and examines the implications of the model for the labor supply elasticity. Section 5 describes the main quantitative experiment and Section 6 different model specifications. Finally, Section 7 concludes.

2 Static Model

This section builds a simple static model of labor supply to explain how to compute the labor supply elasticity and show how a simple policy reform can increase participation in the labor market. Each household has only one agent $i$ who is endowed with asset holdings $a_i$ and has preferences over consumption $c$ and hours worked $h$:

\[
U = \max_{c,h} \; \log c_i + \psi \frac{(1-h_i)^{1-\theta}}{1-\theta} \tag{1}
\]

subject to

\[
c_i = w(1-\tau)h_i + (1+r)a_i \tag{2}
\]

where $w$ is the wage rate per effective unit of labor, $\tau$ is the proportional tax rate, $r$ is the real interest rate, and $a_i$ is $i$'s initial asset holdings. The parameter $\psi$ defines the preference toward leisure and $\theta$ the intertemporal substitution of labor supply.

Intensive Margin Adjustments

The intensive margin is defined by how much existing workers change the amount of hours they supply in response to wage variations. Worker $i$ equates the marginal rate of substitution between consumption and leisure to the real wage rate.

\[
\psi (1-h(a_i))^{-\theta} = \frac{w(1-\tau)}{c(a_i)} \tag{3}
\]

The optimal supply of hours $h(a_i)$ depends on initial asset holdings.
If worker $i$ has a lot of assets she will buy more leisure and work less (income effect). The (intensive) Frisch elasticity of labor supply for $i$ is given by:

\[
\frac{1}{\theta}\,\frac{1-h(a_i)}{h(a_i)}. \tag{4}
\]

The preference specification makes the intensive margin labor supply elasticity endogenous to working hours. Agents working many hours will respond more inelastically than those working a few hours. Hence the amount of heterogeneity in the intensive margin elasticity of labor supply will depend on the distribution of hours across workers.

Extensive Margin Adjustments

The extensive margin of labor supply is defined by how many people enter or exit the labor market in response to wage variations. To make the extensive margin active, I assume that workers have to pay a fixed cost $FC$ every working period. This cost will not affect the optimal choice of hours but will affect the decision to be employed in the first place. Worker $i$ with initial asset holdings $a_i$ will participate if the value of employment $V^E(a_i)$ is at least as large as the value of being unemployed $V^U(a_i)$. These two are given by:

\[
V^E(a_i) = \log\big(w(1-\tau)h(a_i) + (1+r)a_i\big) + \psi\frac{(1-h(a_i))^{1-\theta}}{1-\theta} - FC \tag{5}
\]

\[
V^U(a_i) = \log\big((1+r)a_i\big) + \psi\frac{1^{1-\theta}}{1-\theta}. \tag{6}
\]

The reservation wage is the wage net of taxes that makes the agent indifferent about working or not. It is given by:

\[
w^R(a_i) = \frac{(1+r)a_i}{h(a_i)}\left[\exp\left\{-\psi\frac{(1-h(a_i))^{1-\theta}}{1-\theta} + const\right\} - 1\right] \tag{7}
\]

where $const = \psi\frac{1^{1-\theta}}{1-\theta} + FC$. Participation amounts to $w(1-\tau) > w_i^R$. Ceteris paribus, a rich agent will demand a higher wage to enter the labor market. The participation schedule is a step function and consists of three parts. If $w(1-\tau) < w_i^R$, the worker is not participating. If $w(1-\tau) = w_i^R$, the worker is indifferent about working or not. And if $w(1-\tau) > w_i^R$, the worker enters the labor market. Worker $i$'s extensive margin elasticity depends on the distance between her reservation wage and the market net wage.
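As a rough numerical illustration of the static model up to this point, the sketch below solves the first-order condition (3) for hours by bisection, searches for the net wage at which the value of employment in (5) equals the value of unemployment in (6), and lets one check that this wage agrees with the closed form in (7). The parameter values (`PSI`, `THETA`, `R`, `FC`), the asset levels, and all function names are illustrative assumptions and not the paper's calibration.

```python
import math

# Illustrative parameters (assumptions, not the paper's calibration).
PSI, THETA, R, FC = 1.0, 2.0, 0.04, 0.2

def hours(w_net, a):
    """Hours solving FOC (3): PSI*(1-h)**(-THETA) = w_net / c, with c from (2)."""
    def foc_gap(h):
        c = w_net * h + (1 + R) * a
        return PSI * (1 - h) ** (-THETA) * c - w_net
    if foc_gap(0.0) >= 0:       # leisure is already too valuable: corner at h = 0
        return 0.0
    lo, hi = 0.0, 1.0 - 1e-12   # foc_gap is increasing in h on (0, 1)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if foc_gap(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

def frisch(h):
    """Intensive-margin Frisch elasticity from equation (4); requires h > 0."""
    return (1 - h) / (THETA * h)

def employment_gain(w_net, a):
    """V^E - V^U from equations (5) and (6); requires a > 0."""
    h = hours(w_net, a)
    v_e = (math.log(w_net * h + (1 + R) * a)
           + PSI * (1 - h) ** (1 - THETA) / (1 - THETA) - FC)
    v_u = math.log((1 + R) * a) + PSI / (1 - THETA)
    return v_e - v_u

def reservation_wage(a):
    """Net wage at which V^E = V^U, found by bisection on the net wage."""
    lo, hi = 1e-9, 1.0
    while employment_gain(hi, a) <= 0:   # expand until working is preferred
        hi *= 2
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if employment_gain(mid, a) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

def reservation_wage_closed_form(a, h):
    """Right-hand side of equation (7), evaluated at hours h."""
    const = PSI / (1 - THETA) + FC
    inner = math.exp(-PSI * (1 - h) ** (1 - THETA) / (1 - THETA) + const) - 1
    return (1 + R) * a / h * inner

if __name__ == "__main__":
    for assets in (1.0, 5.0):
        w_r = reservation_wage(assets)
        h_r = hours(w_r, assets)
        print(assets, w_r, reservation_wage_closed_form(assets, h_r), frisch(h_r))
```

Under these assumed parameters the wealthier agent ends up with the higher reservation wage, in line with the ceteris paribus statement in the text. The search is over the net wage directly because hours in (7) themselves depend on the wage.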
If her reservation wage is much lower or much higher than the market net wage, small variations in the market wage will leave the worker unaffected. If her reservation wage is sufficiently close to the market wage, she is very elastic to wage variations. Workers whose reservation wage is sufficiently close to the market wage are the marginal workers.

Taking into account both the intensive and the extensive margin, we can construct the labor supply decision

\[
l_i^s(w^R(a_i)) =
\begin{cases}
h(a_i) & \text{if } w(1-\tau) \geq w^R(a_i) \\
0 & \text{if } w(1-\tau) < w^R(a_i)
\end{cases}. \tag{8}
\]

Aggregate Response of Labor Supply

Let the distribution of reservation wages be denoted as $\phi(w^R)$. The aggregate labor supply at the market wage $w$ equals total amount of hours supplied by people who are working: $L^s(w) = \int_0^w l^s(w^R)\,d\phi(w^R)$. Then, differentiating with respect to the market wage and using the Leibniz rule, we can decompose the aggregate labor supply elasticity to its intensive margin and extensive margin components.

\[
\underbrace{\frac{L^{s\prime}(w)\,w}{L^s(w)}}_{\text{Total Elasticity}}
=
\underbrace{\frac{\int_0^w l'(w^R)\,d\phi(w^R)\,w}{L^s(w)}}_{\text{Intensive Margin Elasticity}}
+
\underbrace{\phi(w)\,\frac{l^s(w)\,w}{L^s(w)}}_{\text{Extensive Margin Elasticity}} \tag{9}
\]

In a heterogeneous agents framework, the adjustment in total hours equals the adjustment in the intensive and the extensive margin. The first term at the right-hand side of equation (9) is the aggregate intensive margin elasticity. The magnitude of the response depends on the curvature of the labor supply function $l'$. The second term at the right-hand side of equation (9) is the aggregate extensive margin elasticity. Its value depends mostly on the distribution of the reservation wages around the market wage $\phi(w)$. If the reservation wage distribution is very concentrated, the ratio $\frac{\phi(w)}{L^s(w)}$ increases and hence the labor supply elasticity increases. The Hansen-Rogerson limit of infinite elasticity is reached if the reservation wage distribution is degenerate.
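To make the role of the reservation wage distribution concrete, the following sketch evaluates the extensive margin term of equation (9) in a special case chosen purely for illustration. Every worker is assumed to supply the same fixed hours, so the intensive margin term is zero and the extensive margin elasticity reduces to $\phi(w)\,w/\Phi(w)$, where $\Phi$ is the cumulative distribution of reservation wages. Reservation wages are assumed to be lognormal. The lognormal form, the parameter values, and the function names are assumptions and do not come from the paper.

```python
import math

def pdf(w, mu, sigma):
    """Lognormal density of reservation wages (an assumed functional form)."""
    z = (math.log(w) - mu) / sigma
    return math.exp(-0.5 * z * z) / (w * sigma * math.sqrt(2 * math.pi))

def cdf(w, mu, sigma):
    """Share of agents whose reservation wage is below w (the employed)."""
    return 0.5 * (1 + math.erf((math.log(w) - mu) / (sigma * math.sqrt(2))))

def extensive_elasticity(w, mu, sigma):
    """Extensive margin term of equation (9) when all workers supply equal hours."""
    return pdf(w, mu, sigma) * w / cdf(w, mu, sigma)

if __name__ == "__main__":
    market_wage = 1.0  # equals the median reservation wage when mu = 0
    for sigma in (0.2, 0.5, 1.0):
        print(sigma, extensive_elasticity(market_wage, 0.0, sigma))
```

In this special case the elasticity at the median reservation wage is proportional to $1/\sigma$, so a more concentrated distribution (smaller $\sigma$) produces a larger aggregate response and the elasticity grows without bound as $\sigma$ shrinks toward zero.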
A dispersed reservation wage distribution, on the other hand, will imply a small aggregate labor supply elasticity.

[Figure 1: Reservation wages and marginal workers. The figure orders the reservation wages $w^R(a_1), \dots, w^R(a_8)$ along a line, marks the market net wage $w(1-\tau)$, and labels groups of agents as workers, marginal workers, and non-participants.]

Figure 1 displays how the model economy works. In this simple example there are eight agents. Each is endowed with initial asset holdings $a_i$ where $a_i < a_j$ with $i < j$. The initial asset holdings distribution will imply a distribution of reservation wages $\phi(w^R(a))$. Low-numbered agents participate in the labor market since their reservation wages are lower than the net market wage. High-numbered, wealthy agents will stay out of the labor market since the net market wage is not high enough. In this example the employment rate is equal to 50%. A wage variation will affect mostly agents 4, 5, and 6 whose reservation wage is sufficiently close to the net market wage. These marginal workers have very high extensive margin labor supply elasticities. The larger the density of workers around the market wage the larger the aggregate response of the economy to a wage change. Agents 1, 2, and 3 will respond only at the intensive margin. This group features zero extensive margin elasticity. Finally, agents 7 and 8 have very large assets so they cannot be affected by small variations in the market wage. Hence, differences in reservation wages generate heterogeneity in labor supply elasticity.

Tax Reform

Since the government cannot identify directly which worker is more elastic, it can use information on their asset holdings. An example of such a (revenue-neutral) tax code is the following:

\[
\tau(a) =
\begin{cases}
\tau_H & \text{if } a \leq a_3 \\
\tau_L & \text{if } a > a_3
\end{cases}
\]

with $\tau_H > \tau_L$. Under this tax system, workers with low assets who also have a low labor supply elasticity pay higher labor income taxes. Figure 2 describes the outcome.
Agents 1, 2, and 3 with low level of asset holdings pay taxes $\tau_H$ and receive a lower net wage $w(1-\tau_H)$. However their reservation wages are low enough to keep them employed. Adjustment will take place only at the intensive margin. Marginal worker 4 continues to work and pays lower taxes. Marginal workers 5 and 6 enter the labor market in response to the tax cuts. Under the new system they receive a higher net wage $w(1-\tau_L)$. Agents 7 and 8 are indifferent to this policy. The new policy increases employment.

[Figure 2: Effects of new tax system on employment. The figure orders the reservation wages $w^R(a_1), \dots, w^R(a_8)$ along a line, marks the net wage $w(1-\tau_H)$ received by agents 1, 2, 3 and the net wage $w(1-\tau_L)$ received by agents 4, 5, 6, and brackets benchmark employment and after-reform employment.]

3 Fully-Specified Dynamic Model

The model is an overlapping generations economy with production and endogenous labor supply decisions. The focus is only on a steady state equilibrium so I will abstract from any time subscript.

Demographics

The economy is populated by a continuum of households. Each household consists of two members, a male ($m$) and a female ($f$). I will use the notation $i = \{m, f\}$. Both household members are assumed to be of the same age $j$. There are a total of $J$ overlapping generations in the economy, with generation $j$ being of measure $\mu_j$. In each period a continuum of new households is born whose mass is $(1+n)$ times larger than the previous generation. Conditional on being alive at period $j-1$, the probability of surviving at year $j$ is $s_j$. Hence, $\mu_{j+1} = \frac{\mu_j s_j}{1+n}$. The weights $\mu_j$ are normalized so that the economy is of measure one. Households whose members reach age $j_R$ have to retire. Retirees receive Social Security benefits $ss$ financed by proportional labor taxes $\tau_{ss}$. Agents have the option to exit the labor market early but if they do so, they will not receive Social Security benefits before the age of $j_R$.3

Timing

The timing of events can be summarized as follows.

1. At the beginning of the period exogenous separations occur. A fraction $\lambda$ of previously employed households is excluded from the labor market.4
2. Idiosyncratic productivity is realized for each household member.
3. All households make consumption and savings decisions. Households that didn't lose their jobs (the fraction $1-\lambda$) make decisions about who will join the workforce.

Preferences

Households derive utility from consumption ($c$) and leisure. Both members are endowed with one unit of productive time, which they split between work ($h^m$ and $h^f$) and leisure. Households' decisions depend on preferences representable by a time separable utility function of the form

\[
U = E_0\left[\sum_{j=1}^{J}\beta^{j-1}\left(\prod_{j=1}^{J}s_j\right)\left\{\log c_j + \psi_j^m\frac{(1-h_j^m)^{1-\theta}}{1-\theta} + \psi_j^f\frac{(1-h_j^f)^{1-\theta}}{1-\theta}\right\}\right] \tag{10}
\]

where $\beta$ is the discount factor and $\theta$ affects the Frisch elasticity of labor supply. While males can choose any allocation between work and leisure, females can only choose between working a given amount of hours or not at all (indivisible labor). Hence $h_j^f = \{0, \bar{h}\}$. Note that I do not allow a case where only the female is working. In addition, I make the assumption that leisure is valued differently by households at different ages. This will help target the participation rates of secondary earners (due to indivisible labor) and the average hours conditional on participation for primary earners.

3 If such a case was allowed, early retirees would start retirement with a lower amount of money in their retirement fund than late retirees. This is exactly what happens in this model when early retirees start eating their assets earlier and hence have a lower amount of money throughout retirement than late retirees. Since both modeling techniques have the same implications about retirees' wealth, I choose the simpler modeling assumption.
4 The reason both household members and not each individually is assumed to lose their job is just for simplicity.
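Returning briefly to the demographic structure of this section, the cohort measures can be built recursively and then rescaled so that the population has measure one. The sketch below assumes the recursion takes the form $\mu_{j+1} = \mu_j s_j/(1+n)$. The survival probabilities, the growth rate, and the function name `cohort_weights` are hypothetical values and names chosen only for illustration.

```python
def cohort_weights(survival, n):
    """Cohort measures mu_1..mu_J, normalized to sum to one.

    Assumes mu_{j+1} = mu_j * s_j / (1 + n), where survival[j-1] holds s_j
    for j = 1..J-1 and n is the population growth rate (illustrative inputs).
    """
    mu = [1.0]                              # unnormalized measure of the youngest cohort
    for s_j in survival:
        mu.append(mu[-1] * s_j / (1.0 + n))
    total = sum(mu)
    return [m / total for m in mu]          # rescale so the economy has measure one

if __name__ == "__main__":
    weights = cohort_weights([0.99, 0.98, 0.95], n=0.01)
    print(weights, sum(weights))
```

With survival probabilities below one and nonnegative population growth, each cohort is smaller than the one before it, which is the usual shape of the age distribution in overlapping generations models of this kind.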
Productivity

Every period, workers receive wages $\hat{w}$ which depend on the prevailing market wage $w$, their skill $z$, their experience $\epsilon_j$, and a persistent idiosyncratic shock $x$. Skills are distributed across households as $\log(z) \sim N(0, \sigma_z^2)$. I assume that household members share the same level of skill.5 The age-specific productivity profile $\{\epsilon_j^i\}_{j=1}^{J}$ is deterministic and captures differences in average wages between workers of different ages. Note that primary and secondary earners face different profiles. Finally each household member draws an idiosyncratic shock that follows an AR(1) process in logs:

\[
\log x_j = \rho \log x_{j-1} + \eta_j, \quad \text{with } \eta_j \sim \text{iid } N(0, \sigma_\eta^2). \tag{11}
\]

Following Attanasio, Low, and Sanchez-Marcos (2008), I assume that both the primary and the secondary earner draw from the same process. However the specific realization of $x$ may very well differ between members. As usual, the autoregressive process is approximated using the method developed by Tauchen (1986). The transition matrix, which describes the autoregressive process, is given by $\Gamma_{xx'}$. Summing up, the natural logarithm of the wage for member $i$ of a household of skill type $z$ and age $j$ is given by

\[
\log \hat{w}_j^i = \log w + \log z + \log \epsilon_j^i + \log x_j^i. \tag{12}
\]

Asset Market and Borrowing Constraints

The asset market has two distinct features. The first is that markets are incomplete. Within the set of heterogeneous agents life-cycle models such an assumption is standard. From an empirical standpoint incomplete markets support the evidence that consumption responds to income changes. At the same time, in the absence of state-contingent assets agents use labor effort to insure against negative labor income shocks. This mechanism lowers the correlation between hours and wages, a pattern well documented in the data (Low, 2005, and Pijoan-Mas, 2006). With this in mind, I restrict the set of financial instruments to a risk-free asset.
In particular, households buy physical claims to capital in the form of an asset $a$, which costs 1 consumption unit at time $t$ and pays $(1+r)$ consumption units at time $t+1$. $r$ is the real interest rate and will be determined endogenously in the model by the intersection of aggregate savings to aggregate demand for investment. The second feature is a zero borrowing limit.6 This assumption can greatly affect labor supply responses.7 In the model, savings takes place for three reasons. Households wish to smooth consumption across time (intertemporal savings motive), to insure against labor market risk (precautionary savings motive), and to insure against retirement (life-cycle savings motive).

Production

There is a representative firm operating a Cobb-Douglas production function. The firm rents labor efficiency units and capital from households at rate $w$ (the wage rate per effective unit of labor) and $r$ (the rental rate of capital) respectively. Capital depreciates at rate $\delta \in (0, 1)$. The aggregate resource constraint is given by

\[
C + (n+\delta)K + G = f(K, L) \tag{13}
\]

where $C$ is aggregate consumption, $K$ is aggregate capital, and $L$ is aggregate labor measured in efficiency units. $G$ represents government expenditures. Equation (13) equalizes total demand and total supply. The latter equals output produced by the production technology $f(K, L)$.

Government

The government operates a balanced pay-as-you-go Social Security system. Households receive Social Security benefits $ss$ that are independent of the members' contributions and are financed by proportional labor taxes $\tau_{ss}$. This payroll tax is taken as exogenous in the analysis.

5 There is ample evidence that schooling decisions of husband and wife are positively correlated. Pencavel (1998) reports that the odds of being married to someone with the same schooling level is 1.03 and the odds of being married to someone with almost the same years of schooling is 8.62.
In addition, the government needs to collect revenues in order to finance the given level of government expenditures $G$. To do so it taxes consumption, capital, and labor. Consumption and capital income taxes $\tau_c$, $\tau_k$ are proportional and exogenous. Households file a single (SN) or a joint (JN) tax return based on whether it is a single or two-earner household.8 Tax rates are computed based on a household's total pre-tax labor earnings $W = \hat{w}^m h^m + \hat{w}^f h^f$ with $\hat{w} = w z \epsilon_j x$ using a nonlinear tax schedule of the form:

\[
T_L^{SN}(W) = W - (1-\tau_0)W^{1-\tau_1} \tag{14}
\]

\[
T_L^{JN}(W) = W - (1-\tau_0)W^{1-\tau_2}. \tag{15}
\]

In the case of single filing, by definition we have that only the male is working, i.e., $W = \hat{w}^m h^m$. If $\tau_1 = 0$ (and similarly $\tau_2$), the tax function becomes a proportional tax schedule. For $\tau_1 > 0$ the system becomes progressive since high earners pay a higher fraction of their earnings in taxes. I model both parameters $\tau_1$, $\tau_2$ to reflect different marginal tax rates faced by single and joint filers in the U.S. tax system. The parameter $\tau_0$ affects the average and marginal tax rate in the same way. Higher values of $\tau_0$ imply that working agents face both higher average and marginal tax rates.

6 The reason the limit is zero instead of a small negative value is the presence of stochastic mortality. If borrowing was allowed, some net borrowers would die (unexpectedly) without having paid their debt.
7 According to Domeij and Floden (2006) borrowing constrained individuals can smooth their consumption only by increasing their labor supply. Hence, in the presence of borrowing constraints the labor supply elasticity is downward biased.
8 In reality the US tax system is much more flexible. For example, households where both members are working can choose between filing jointly or separately. In addition, households can file jointly even if the spouse has no income. In this paper for simplicity I associate single and joint filing status to the number of working members in the household.
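The properties of the schedule in equations (14) and (15) can be checked numerically. The sketch below implements $T_L(W) = W - (1-\tau_0)W^{1-\tau_1}$ together with the implied average rate $1 - (1-\tau_0)W^{-\tau_1}$ and marginal rate $1 - (1-\tau_0)(1-\tau_1)W^{-\tau_1}$. The parameter values and the normalization of earnings are hypothetical and are not the paper's estimates. The function names are likewise illustrative.

```python
def labor_tax(W, tau0, tau1):
    """Tax liability from equations (14)-(15): W - (1 - tau0) * W**(1 - tau1)."""
    return W - (1.0 - tau0) * W ** (1.0 - tau1)

def average_rate(W, tau0, tau1):
    """Average tax rate T(W) / W = 1 - (1 - tau0) * W**(-tau1)."""
    return labor_tax(W, tau0, tau1) / W

def marginal_rate(W, tau0, tau1):
    """Marginal tax rate dT/dW = 1 - (1 - tau0) * (1 - tau1) * W**(-tau1)."""
    return 1.0 - (1.0 - tau0) * (1.0 - tau1) * W ** (-tau1)

if __name__ == "__main__":
    tau0, tau1 = 0.10, 0.10          # hypothetical values for illustration only
    for W in (0.5, 1.0, 2.0, 4.0):   # earnings in arbitrary normalized units
        print(W, average_rate(W, tau0, tau1), marginal_rate(W, tau0, tau1))
```

Setting `tau1 = 0` collapses the schedule to a proportional tax at rate `tau0`, while any `tau1 > 0` makes the average rate rise with earnings and places the marginal rate above the average rate, which is the sense in which the system is progressive.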
The tax specification in equations (14) and (15) is used by Heathcote, Storesletten, and Violante (2014). Finally, the government uniformly distributes the accidental bequests (due to stochastic mortality) to all living households. These transfers are denoted $Tr$.

Fixed Cost and Search Cost

To introduce participation decisions for the primary earners, I assume that they have to pay a fixed cost every time they work (participation is the only possible decision for the secondary earner). The fixed cost $FC_j$ is expressed in utility terms and depends on age. In addition, I assume that people who were unemployed at age $j-1$ have to pay an extra cost in order to work at age $j$, rationalized as a search cost $sc_j$. This means we have to track previous employment status $S_{-1} = \{u, e\}$ for each household member. Note that both the fixed cost and the search cost depend on age. In summary, the total cost of working for primary earners is

\[
\zeta_j^m(S_{-1}) =
\begin{cases}
FC_j + sc_j^m & \text{if } S_{-1}^m = u \\
FC_j & \text{if } S_{-1}^m = e
\end{cases}. \tag{16}
\]

The total cost of working for secondary earners is

\[
\zeta_j^f(S_{-1}) =
\begin{cases}
sc_j^f & \text{if } S_{-1}^f = u \\
0 & \text{if } S_{-1}^f = e
\end{cases}. \tag{17}
\]

Household's problem

Households are indexed by their skill type and their age $(z, j)$. Additional heterogeneity is faced with respect to the amount of asset holdings $a$, the stochastic productivity components of its members $x = \{x^m, x^f\}$, and the previous employment status for each member $S_{-1} = \{S_{-1}^m, S_{-1}^f\}$. A household's decision is constrained by the limited borrowing constraint $a' \geq 0$ and the nonnegative consumption constraint $c \geq 0$. In the following problems, I take these constraints as given. The value function for a household of skill $z$ and age $j$ is denoted by $V_{zj}^{EE}$ when both members are working, by $V_{zj}^{EU}$ when only one member is working, and by $V_{zj}^{U}$ when both members are out of the labor market.
In particular:

$$V_{zj}^{EE}(a, x, S_{-1}) = \max_{c,\,a',\,h^m} \Big\{ \log(c) + \psi_j^m \frac{(1-h^m)^{1-\theta}}{1-\theta} + \psi_j^f \frac{(1-\bar{h})^{1-\theta}}{1-\theta} - \zeta_j^m(S_{-1}) - \zeta_j^f(S_{-1}) + \beta s_{j+1} \sum_{x_m'} \sum_{x_f'} \Gamma_{x_m x_m'} \Gamma_{x_f x_f'} \big[ (1-\lambda)\, V_{z(j+1)}(a', x', S) + \lambda\, V_{z(j+1)}^{U}(a', x') \big] \Big\} \tag{18}$$

s.t.

$$(1+\tau_c)c + a' = (1-\tau_{ss})(\hat{w}^m h^m + \hat{w}^f \bar{h}) - T_L^{JN}(\hat{w}^m h^m + \hat{w}^f \bar{h}) + (1 + r(1-\tau_k))(a + Tr) \tag{19}$$

$$x_m' \sim \Gamma_{x_m, x_m'} \tag{20}$$

$$x_f' \sim \Gamma_{x_f, x_f'} \tag{21}$$

$$S^m = e \tag{22}$$

$$S^f = e \tag{23}$$

Equation (19) is the household's budget constraint. As usual, consumption and savings equal after-tax labor and capital income. Transfers from accidental bequests are part of the budget constraint. Equations (20)-(23) describe the evolution of the state variables. Productivity x evolves according to the autoregressive process. In addition, next period's employment status S will be e for both household members. The value function for the unemployed household is given by the following equation.

$$V_{zj}^{U}(a, x) = \max_{c,\,a'} \Big\{ \log(c) + \frac{\psi_j^m}{1-\theta} + \frac{\psi_j^f}{1-\theta} + \beta s_{j+1} \sum_{x_m'} \sum_{x_f'} \Gamma_{x_m x_m'} \Gamma_{x_f x_f'} \big[ (1-\lambda)\, V_{z(j+1)}(a', x', S) + \lambda\, V_{z(j+1)}^{U}(a', x') \big] \Big\} \tag{24}$$

s.t.

$$(1+\tau_c)c + a' = (1 + r(1-\tau_k))(a + Tr) \tag{25}$$

$$x_m' \sim \Gamma_{x_m, x_m'} \tag{26}$$

$$x_f' \sim \Gamma_{x_f, x_f'} \tag{27}$$

$$S^m = u \tag{28}$$

$$S^f = u \tag{29}$$

Notice that in this case $S_{-1}$ is not a state variable. Moreover, if either member decides to work next year, he/she will have to pay the search cost (the continuation value includes employment status S = u). The value function for a household where only the male is working can easily be deduced keeping in mind that the spouse is not working ($\bar{h} = 0$), that the household files a single tax return ($T_L = T_L^{SN}$), and that the spouse has to pay the search cost if she decides to return to the workforce next period, i.e. $S^f = u$. Household members decide who will join the workforce by comparing

$$V_{zj} = \max_{h^m \in \{0,\, h^{EU},\, h^{EE}\}} \big\{ V_{zj}^{EE},\, V_{zj}^{EU},\, V_{zj}^{U} \big\} \tag{30}$$

where $h^{EU}$ is the primary earner's optimal hours choice if the spouse does not work, while $h^{EE}$ is his choice if the spouse is also part of the workforce.
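The period payoff and the discrete comparison in equation (30) can be sketched as follows. This is only an illustration of the structure: the function names, the dictionary labels, and the numbers in the usage lines are hypothetical, and the candidate values would in practice come from solving the full dynamic problem rather than being supplied by hand.

```python
import math


def leisure_utility(h, psi, theta):
    """psi * (1 - h)**(1 - theta) / (1 - theta); h is hours as a share of the time endowment."""
    return psi * (1.0 - h) ** (1.0 - theta) / (1.0 - theta)


def flow_utility(c, h_m, h_f, psi_m, psi_f, theta, cost_m=0.0, cost_f=0.0):
    """Period utility: log consumption, leisure of both members, minus the costs of working."""
    return (math.log(c)
            + leisure_utility(h_m, psi_m, theta)
            + leisure_utility(h_f, psi_f, theta)
            - cost_m - cost_f)


def choose_status(values):
    """Return the employment configuration with the highest value, as in equation (30)."""
    return max(values, key=values.get)


# Hypothetical candidate values for one household state (illustration only).
print(choose_status({"EE": -1.9, "EU": -1.7, "U": -2.4}))
```

Setting hours to zero for both members reproduces the leisure terms of the unemployed household in equation (24).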
The problem for the retirees is similar to that of the unemployed, with the exception of the Social Security benefit received every period. It is not displayed for convenience.

Distribution of states. The state space is defined as $\Omega = A \times X \times Z \times \Sigma$. $A = [0, \bar{a}]$ is the asset space. The lower bound of zero is based on our no-borrowing assumption. Since the agents cannot save more than what they earn over their lifetime, we can safely assume an upper bound $\bar{a}$. X = R is the productivity space for the primary and the secondary earner, and Z = R is the space for the household's skill level. $\Sigma = \{ee, eu, uu\}$ is the set of possible values for the previous employment status of the household's members. The policy functions for savings, consumption, and hours are given by $g_{zj}^{a}(\omega)$, $g_{zj}^{c}(\omega)$, and $g_{zj}^{hm}(\omega)$, $g_{zj}^{hf}(\omega)$, respectively. Let $\Phi_{zj}(a, x, S_{-1})$ denote the cumulative probability distribution of states $(a, x, S_{-1}) \in \Omega$ across households of type (z, j). The marginal density is denoted by $\phi_{zj}(a, x, S_{-1})$.

Equilibrium. The model is solved in general equilibrium. The equilibrium is described in a recursive way. I focus on a stationary equilibrium where prices and aggregate variables are constant. Specifically, given a tax structure $\{\tau_c, T_L^{SN}(.), T_L^{JN}(.), \tau_k, \tau_{ss}\}$ and an initial distribution $\Phi_{z1}(a = 0, x, S_{-1} = \{uu\})$, a stationary competitive equilibrium consists of functions $\{V_{zj}^{EE}, V_{zj}^{EU}, V_{zj}^{U}, g_{zj}^{a}, g_{zj}^{c}, g_{zj}^{hm}, g_{zj}^{hf}\}_{j=1}^{J}$, prices $\{w, r\}$, inputs $\{K, L\}$, benefits $\{ss\}$, transfers $\{Tr\}$, and distributions $\{\Phi_{zj}(a, x, S_{-1})\}_{j=2}^{J}$ s.t.
• given prices $\{w, r\}$, benefits $\{ss\}$, and transfers $\{Tr\}$, the functions solve the household's problem;

• the prices satisfy the firm's optimal decisions, $r = F_K(K, L) - \delta$ and $w = F_L(K, L)$;

• capital and labor markets clear:

$$K = \sum_{j=1}^{J-1} \mu_{j+1} \int_{\Omega} g_{zj}^{a}\, \phi_{zj} \quad \text{and} \quad L = \sum_{j=1}^{J} \mu_j \int_{\Omega} \big( z x^m \epsilon_j^m g_{zj}^{hm} + z x^f \epsilon_j^f g_{zj}^{hf} \big)\, \phi_{zj};$$

• the Social Security system clears:

$$\tau_{ss}\, w L = ss \sum_{j=j_R}^{J} \mu_j;$$

• the transfers are given by:

$$Tr = \int_{\Omega} \mu_j (1 - s_j)\, g_j^{a};$$

• the government balances its budget:

$$G = \tau_c C + \tau_k r K + \sum_{i=SN,JN} \int_{\Omega} T_L^{i}(.)\, d\phi;$$

• the distribution of states for households with skill level z that are currently working evolves based on the following rule:

$$\phi_{z(j+1)}(a', x', \{ee\}) = \sum_{S_{-1} \in \{ee, eu, uu\}} \sum_{x_m} \sum_{x_f} \Gamma_{x_m x_m'} \Gamma_{x_f x_f'}\, \phi_{zj}\big( g_a^{-1}(a', .), x, S_{-1} \big)$$

To understand the last condition, note that $\phi_{z(j+1)}(a', x', \{ee\})$ is the density of households at age j + 1 with assets $a'$, productivity vector $x'$, and whose members were both working at age j. This measure will consist of different households that saved $a' = g_{zj}^{a}(a, x, S_{-1})$. The inverse function $g_a^{-1}(a', x, S_{-1})$ gives the amount of assets a needed to save $a'$ given the productivity vector x. Of the people with states a, x that lead to savings $a'$, only a fraction $\Gamma_{x_m x_m'} \Gamma_{x_f x_f'}$ will move to $(a', x')$. The sum is taken over all possible values of $x_m$, $x_f$. The outer sum denotes that this rule holds for age j households with any kind of employment status at j − 1. We can construct similar rules for other states.

4 Quantitative Analysis

4.1 Stylized Facts on Life-Cycle Labor Supply

I use data from the PSID waves from 1970 to 2005 and collect information on male primary earners as well as secondary household members. I exclude households that consist of a female primary earner (see Appendix A for a description of the data). An agent is regarded as employed if he/she works more than 800 hours annually (15 hours per week). The key patterns emerging from the analysis are the following:

1.
For males, annual working hours are roughly hump shaped over the life cycle. However, conditional on participation, males' lifetime labor supply varies little. This means that life-cycle variations in average hours are mainly driven by the participation margin.

2. Average participation for females is lower than for males (62% versus 88%). Participation is modest during the childbearing years. As a result, the participation profile for females peaks at age 50, much later than for males.9

3. The probability of being employed at time t + 1 is very high (around 95%) for males employed at time t. The probability decreases only after age 60. The probability of switching to employment at time t + 1 for males unemployed at time t is decreasing along the life cycle. This implies that unemployment becomes an absorbing state. Females' labor supply follows similar patterns although the transition rates are lower, reflecting a smaller participation rate.

4. The relationship between labor-market participation and asset holdings is also non-monotonic. Workers at the tails of the wealth distribution work less than workers with median asset holdings.

These patterns are consistent with other studies focusing on the life-cycle labor supply of males (Prescott, Rogerson, and Wallenius, 2009, and Erosa, Fuster, and Kambourov, 2013) and females (Attanasio, Low, and Sanchez-Marcos, 2008). As shown in Section 4.3, our model succeeds in replicating these facts very closely.

9 The life-cycle profile of employment for females is constructed taking into account that the life-cycle behavior varies significantly across women of different cohorts (see Appendix D for more information).

4.2 Calibration

This section describes the calibration of the model. I calibrate a group of parameters based on values used in the literature. Then I choose the remaining parameters so that the associated stationary equilibrium is consistent with the U.S. data along several dimensions.
The parameter values are summarized in Appendix E.

Externally Set Parameters. The model period is set to one year. The agents are born at the real life age of 21 (model period 1) and live up to a maximum real life age of 101 (model period 81). Agents become exogenously unproductive and hence retire at the real life age of 65 (model period 46). The survival probabilities are taken from the life table (Table 4.C6) in Social Security Administration (2005). I use an average of the survival probabilities reported for males and females. The population growth rate is set to n = 1.1%, the long-run average population growth in the United States. The production function is Cobb-Douglas, $f(K, L) = K^{\alpha} L^{1-\alpha}$, where α = 0.36 is chosen to match the capital share. As already noted, preferences are separable in consumption and leisure. Parameter θ, which determines the Frisch labor supply elasticity, is set to 2. This is based on Erosa, Fuster, and Kambourov (2013). The time endowment equals 5,200 hours per year (Prescott, Rogerson, and Wallenius, 2009). The secondary earner can work for $\bar{h} = 0.34$ since in the PSID females (who participate in the labor market) work on average 1,786 hours annually. The deterministic age-dependent productivity profile is estimated from the PSID using real hourly log-wages. A hump-shaped profile emerges for both males and females. The female to male hourly wage ratio is found to be 0.72, which is identical to the value of 0.72 that I calculate by using the numbers reported by Blau and Kahn (2000) for the periods 1978-1998.10 For the tax rates, I use values based on Imrohoroglu and Kitao (2012). The consumption tax is set at $\tau_c$ = 5% and the capital tax rate at $\tau_k$ = 30%. The Social Security tax is set at $\tau_{ss}$ = 10.6% based on Kitao (2010). This gives a replacement ratio around 45%. Finally, we need to pin down the parameters $\tau_1$ and $\tau_2$. The functional form of our tax functions implies that after-tax earnings are log-linear in pre-tax earnings.
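The log-linearity follows in one step from equation (14); the same algebra applies to joint filers with the second progressivity parameter. The derivation below uses only the symbols already defined for the tax schedule.

```latex
\begin{align*}
W - T_L^{SN}(W) &= (1-\tau_0)\, W^{1-\tau_1} \\
\log\!\big(W - T_L^{SN}(W)\big) &= \log(1-\tau_0) + (1-\tau_1)\,\log W
\end{align*}
```

Under this functional form, the slope of log after-tax earnings on log pre-tax earnings equals one minus the progressivity parameter, and the intercept identifies the level parameter. Whether the estimates come from exactly such a regression is an assumption here; the text only states the data source and sample period.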
I estimate the parameters $\tau_1$ and $\tau_2$ using data from the CPS for single and joint filers, respectively, for the time period 1992-2007. The values are $\tau_1$ = 0.073 and $\tau_2$ = 0.065.

Parameters calibrated within the model. There are a total of 24 parameters to be calibrated. In a general equilibrium framework all parameters affect all moments. However, in order to give a sense of how the calibration works, I associate a specific parameter with a given moment.

• Discount factor (β): The discount factor directly affects the level of aggregate savings. A higher discount factor leads to more savings and a higher capital-output ratio. The discount factor targets a capital-output ratio equal to 3.2.

• Depreciation rate (δ): Using the steady state relationship $I = (n + \delta)K$, we can easily pin down the depreciation rate as $\delta = \frac{I/Y}{K/Y} - n$. Targeting an investment-output ratio of 0.25 leads to a value of δ = 0.0816.

• Fixed costs $FC_j$: The fixed cost discourages primary earners from participating in the labor market. I assume that individuals before age 45 face a fixed cost equal to $fc_1$. After that age the fixed cost is given by $FC_j = fc_2 + fc_3\, j$. To find the three values, I target the average employment rate at three stages of the life cycle: early working years (ages 21-35), middle ages (36-50), and the rest of the life cycle (ages 51-65), equal to 0.92, 0.93, and 0.75, respectively.

• Utility parameter for secondary earners ($\psi_j^f$): These parameters capture the relative preference toward work. Higher values of $\psi^f$ decrease the willingness of females to participate in the labor market. To pin down $\psi_j^f$ I target the inverse U-shaped participation profile for females.

10 In spite of the wages being estimated on a sample of working females, our calibration does not suffer from significant selection bias. See Appendix F for a discussion.
In particular, I assume the following relationship $\psi_j^f = \gamma_0^f + \gamma_1^f j + \gamma_2^f j^2 + \gamma_3^f j^3 + \gamma_4^f j^4$ and use the average participation rates across five different age groups (21-30, 31-40, 41-50, 51-60, 61-65) to pin down the $\gamma^f$'s.

• Utility parameter for primary earners ($\psi_j^m$): I use $\psi_j^m$ to match the slightly hump-shaped profile of hours conditional on participation. Again I assume a relationship $\psi_j^m = \gamma_0^m + \gamma_1^m j + \gamma_2^m j^2 + \gamma_3^m j^3 + \gamma_4^m j^4$ and use average working hours conditional on participation across five different age groups (21-30, 31-40, 41-50, 51-60, 61-65) to pin down the $\gamma^m$'s.

• Separation rate (λ): A higher separation rate increases the transitions from employment to unemployment. I use the average probability of entering unemployment, equal to 5.50%, as a target.

• Search costs ($sc_j^m$, $sc_j^f$): The search cost disciplines the transitions between unemployment and employment. For both primary and secondary earners I assume the form $sc_j = \eta_0 + \eta_1 j$ and use the average transition probability for males and females between ages 21-42 and 43-65 to calculate a total of four parameters.

• Tax parameter ($\tau_0$): This parameter is pinned down so that in equilibrium the government spending to output ratio equals 0.22.

• Productivity parameters ($\sigma_z$, ρ, $\sigma_\eta$): To pin down the last three parameters I follow the identification strategy of Storesletten, Telmer, and Yaron (2004). My main target is the life-cycle profile of the variance of log labor earnings. Using information from the PSID I find that the variance evolves in a linear manner. The profile starts from 0.27 at age 21 and increases linearly to 0.75 by the age of 65. In this model all agents start off their lives having the same transitory shock x. As a result, any dispersion in labor earnings at the start of life is caused by the dispersion in the fixed effect z, i.e., by the parameter $\sigma_z$. As the cohort ages, the distribution of transitory shocks converges towards its invariant distribution.
The variance of log labor earnings at the stationary distribution is pinned down by the variance of the transitory shock, $\sigma_\eta$. Lastly, the persistence of the transitory shock determines how fast we get to the invariant distribution. The slower the rate, the flatter the slope of the life-cycle variance. This helps pin down ρ.

4.3 Model's Performance

Our calibration strategy left a rich set of statistics untargeted. A good way to test the model is to examine how it performs with respect to these untargeted moments. Good performance builds confidence in using the model for policy recommendations.

Life-Cycle Profiles of Employment and Hours. The upper two panels of Figure 3 plot the life-cycle profiles for participation of both males and females, the average working hours for males, and the average working hours conditional on participation, again for males. Our calibration targeted the average participation rate of males between 21-35, 36-50, and 51-65. The upper left panel of Figure 3 examines how well the model fits the whole life-cycle profile. In the model, employment features the three phases observed in the data. The first is an increasing profile up to age 30. Agents receive relatively lower wage offers at the beginning of their career. The reason they can afford to stay out of the labor market during the first years is that they own some assets (from accidental bequests). Gradually, as productivity increases, they decide to enter the labor market. The second feature of the data captured by the model is a flat, very persistent profile at middle ages. There are two reasons why agents at this age are very strongly attached to their labor market status. The first is very high productivity. The second is the search cost, which deters people from going in and out of employment at regular time intervals.
Finally, the model replicates the steep decline in employment rates after age 50, generated by a large stock of accumulated savings and a declining average life-cycle productivity. The model also replicates the inverted U-shaped life-cycle profile of female participation. Unlike males, females tend to postpone labor market entry for a longer time. The fixed cost of working (the preference parameters $\psi_j^f$) is calibrated at a relatively high value in the first period of the life cycle.11 As a result, and given perfect consumption insurance within the household, females stay out of the market for a longer time. In the model, 87.2% of males and 61.8% of females participate in the labor market. In the data, these numbers are 87.1% and 62.3%, respectively.

11 Note that the model can capture labor market participation for both males and females reasonably well, even in the absence of age-dependent parameters. See Appendix G for a discussion.

[Figure 3 appears here: four panels plotting PSID data against the model by age (20 to 70), titled Participation Rates, Average Hours (Primary Earner), Transition Rates (Primary Earner), and Transition Rates (Secondary Earner).]

Figure 3: Upper Left Panel. Participation Rates for Primary Earner (top graph) and Secondary Earner (bottom graph). Upper Right Panel. Average Hours for Primary Earner Conditional on Participation (top graph) and All Sample (bottom graph). Lower Left Panel. Transition Rates for Primary Earner Employment to Employment (top graph) and Unemployment to Employment (bottom graph). Lower Right Panel. Transition Rates for Secondary Earner Employment to Employment (top graph) and Unemployment to Employment (bottom graph).

The upper right panel of Figure 3 plots average working hours for primary earners conditional on participation. Many factors affect this profile.
To build intuition we write the Euler equation for hours (without uncertainty).

$$\left( \frac{1 - h_{j+1}^m}{1 - h_j^m} \right)^{\theta} = \beta s_{j+1} (1 + r(1-\tau_k))\, \frac{\psi_j^m}{\psi_{j+1}^m}\, \frac{\epsilon_j^m}{\epsilon_{j+1}^m} \tag{31}$$

The profile depends on the life-cycle productivity ratio $\epsilon_j^m / \epsilon_{j+1}^m$. In addition, the profile depends on the calibrated value of $\beta s_{j+1}(1 + r(1-\tau_k))$. This value is approximately 1.02, which decreases the average hours over the life cycle. Lastly, the profile depends on θ. Higher values of θ imply a smaller intensive margin labor supply elasticity and a smaller response of hours to wage and interest rate changes. Hence, high values of θ imply a flatter hours profile. To better match the profile, I use the preference parameters $\psi_j^m / \psi_{j+1}^m$.

Life-Cycle Transitions. The lower left and right panels of Figure 3 plot the average transition rates for the primary and secondary earner, respectively. The top graphs in both panels plot the probability of moving from employment to employment while the bottom graphs plot the probability of moving from unemployment to employment. The separation rate λ targeted the average employment-to-employment transition rate for primary earners. The model is able to match the very flat probability of staying employed within a year and the decreasing part after age 60. The model seems to overpredict the probability of staying employed for secondary earners.12 The model also matches a decreasing life-cycle probability of switching from unemployment to employment for both males and females. To discipline these profiles I used the search cost parameters. In the presence of the search cost, workers spread their working years as little as possible and (most of them) retire once and for all once they have accumulated enough assets. This explains the decreasing profiles and especially the small probability of moving to employment for unemployed agents close to retirement.

Wealth-hours correlation and wealth inequality. In Table 1 I report participation rates across households of different wealth.
Wealthy workers a) can easily switch to unemployment since they can use their assets to smooth consumption and b) have a strong incentive to be employed since they probably earn high wages. In the data these two effects produce a nonmonotonic relationship, with income effects being stronger only for the very rich. The model mimics this nonmonotonic relationship between assets and participation even though the average participation is lower than what the data suggest. We should note that this is not a failure of the model and can be explained by the limited availability of information about wealth in the PSID. The model is calibrated to match the participation rates between 1971-2005. Wealth can be found only in specific waves in the PSID. So it should not come as a surprise that the average participation in the model is different than in the data in Table 1 but not in Figure 3.

12 We could match the profile by introducing different separation rates for females as we did for males. However, this would increase the computational complexity with minor implications regarding the main results.

Table 1: Participation by Wealth Quartile

Wealth Quartile   Data (Primary Earner)   Model (Primary Earner)   Data (Secondary Earner)   Model (Secondary Earner)
1st               0.82                    0.77                     0.56                      0.51
2nd               0.89                    0.84                     0.64                      0.63
3rd               0.88                    0.80                     0.65                      0.61
4th               0.83                    0.75                     0.61                      0.57

Table 2: Wealth Gini by Age

Age Group   Gini (PSID)   Gini (Model)
21-30       0.8252        0.5959
31-40       0.7966        0.5657
41-50       0.7804        0.5459
51-60       0.7403        0.5353
61-65       0.7527        0.5579

It is also important to test if the model can generate a realistic amount of wealth heterogeneity not only at the aggregate level but also within age cohorts. Table 2 reports wealth Gini coefficients across age groups as found in the PSID and in the model. I find that in the PSID the coefficients are highest for households in their 20s and weakly decreasing between the ages of 30 and 60, with a small increase for people between ages 60-65.
The model can accurately replicate this U-shaped profile. However, the model is not able to generate a wealth distribution as concentrated as in the data. The failure to generate a highly concentrated wealth distribution is standard in models with incomplete markets and idiosyncratic risk. To test the implications of this issue, I create an economy which artificially matches the wealth Gini observed in the data. In particular, I generate some very rich individuals by calibrating the labor income variance as well as the probability of moving to lower income levels. In this economy, the labor supply elasticity (defined and calculated in Section 5) is approximately 0.2 percentage points higher than in the benchmark economy. The reason is that there are more rich people in the economy who can move more easily between employment and unemployment. The reason the difference in the estimates is relatively small is that these very rich, very productive individuals will most likely not be marginal, and thus not relevant for the labor supply elasticity.

4.4 Labor Supply Elasticity

In this section I present how the labor supply elasticity varies across population groups. To compute the labor supply elasticity I simulate the effects of a one-time unanticipated increase in the wage. Since the increase is small there are no wealth effects. Hence, we can interpret the elasticity as a Frisch elasticity of labor supply (Blundell, Costa-Dias, Meghir, and Shaw, 2013). Table 3 presents our results. The intensive margin labor supply elasticity is the percentage change in labor supply in response to a one-percent change in the wage for previously employed workers. This elasticity can only be calculated for males. The intensive margin labor supply elasticity is 0.64. This value depends crucially on parameter θ, which is calibrated at the value of 2. Intensive elasticities across age groups range from 0.62 to 0.71. At the same time the elasticity decreases with wealth.
The variation is insignificant though with all groups ranging between the values of 0.63 and 0.65. In general the dispersion of intensive elasticities is small because conditional on participation primary earners work more or less the same amount of hours. More interesting are the findings regarding the extensive margin labor supply elasticity for both primary and secondary earners. The extensive margin labor supply elasticity is the percentage change in labor supply in response to a one-percent change in the wage due to individuals joining the workforce. The elasticity depends on the relative density of marginal workers around the market wage. I find a labor supply elasticity of 0.99 for males and 1.83 for females. The aggregate extensive margin elasticity seems to account for about 68% of the total value. What sets the extensive margin apart is the significant variation in labor supply elasticity across groups. Males at younger ages have smaller participation elasticities relative to people closer to retirement. A shorter time horizon makes it easier for older households to switch to retirement since they find it less costly to give up their job. At the same time they can afford to do so since they have a larger amount of assets. In contrast, middle-aged groups have a very large incentive to work as they are receiving high wages and they need to start building up their retirement savings.13 For example, male workers between ages 31-40 have elasticities around 0.44 while those after age 60 an elasticity of 4.98. Although females exhibit the same age pattern, the average value is much larger. For example, even at ages 31-40 females have elasticities of 1.81, much larger than the value of males. Females can move easily between employment and unemployment as they are never the only financial provider in the household. The assumption of perfect risk sharing within the household is crucial to generate this result. 
Finally, the relationship between asset holdings and labor supply elasticity is nonmonotonic. Households with a lower than the median amount of financial assets have a large elasticity of labor supply, probably since they receive low wages. At the fourth quartile of the wealth distribution, the elasticity is also large since these individuals have a larger outside option.

13 I verify that the age-pattern is true even if we look within wealth groups. Hence, it is both a shorter time horizon and a larger amount of assets that make older workers more elastic.

Table 3: Labor Supply Elasticity

                   Intensive-Males   Extensive-Males   Extensive-Females
Age group
21-30              0.62              1.35              1.80
31-40              0.65              0.44              1.81
41-50              0.63              0.50              1.36
51-60              0.70              0.98              1.66
61-65              0.71              4.98              5.01
Wealth Quartile
1st                0.65              0.96              1.96
2nd                0.62              1.18              1.90
3rd                0.63              0.79              1.64
4th                0.63              1.05              1.83
Aggregate          0.64              0.99              1.83

Comparison with the Literature. There is an extensive literature and a wide range of methodologies regarding the measurement of labor supply elasticity for both males and females. For example, Erosa, Fuster, and Kambourov (2013) calculate the labor supply elasticity within a model with incomplete markets and nonlinear wages. The aggregate labor supply elasticity in their paper is 1.27, with the extensive margin accounting for almost 50% of the aggregate value. I find that the extensive margin accounts for 68%, a value very close to the 70% reported by Kimmel and Kniesner (1998). Chang and Kim (2006) show that an indivisible labor economy calibrated to match heterogeneity in wages and participation rates gives a labor supply elasticity of around 0.9 for males and 1.1 for females. Blundell, Costa-Dias, Meghir, and Shaw (2013) find an extensive margin labor supply elasticity for females of 0.9. Kimmel and Kniesner (1998) report an extensive margin labor supply elasticity for males equal to 1.25 and equal to 2.39 for females (see Keane, 2011).
My values of 0.99 and 1.83 are relatively close to their range.14 There is also some work conducted on the issue of group level elasticities. Rogerson and Wallenius (2009) find employment responses to a wage change that are concentrated among young and old workers. Erosa, Fuster, and Kambourov (2013) find elasticities of 1.0 for agents around 25-35 and 1.98 for individuals aged 55-64. Gourio and Noual (2009) focus on younger cohorts and report a decreasing pattern of labor supply elasticity with younger people being more elastic than the middle-aged. Jaimovich and Siu (2009) report that young and old cohorts experience much greater cyclical volatility in hours than the prime-aged. Lastly, French (2005) simulates a life-cycle model and finds that at age 40 the labor supply elasticity is around 0.25 while at age 60 it is around 1.15. My findings are consistent with the age-profiles reported in these papers.

14 Most of the literature that focuses on the intensive margin points to relatively small labor supply elasticity, especially for males. For example, MaCurdy (1981) finds a value equal to 0.15 for the Frisch labor supply elasticity. Pistaferri (2003) reports a higher value of 0.70. Our value of 0.64 is close to the upper bound of these estimates.

4.5 Indirect Diagnosis: Evidence from the SIPP

In this section I will discuss an indirect way to validate the model-generated estimates of the labor supply elasticity. In particular, I compare the model's estimates for reservation wages to direct evidence on reservation wages from the Survey of Income and Program Participation (SIPP). To my knowledge, this is the first paper to use empirical evidence on reservation wages to test the predictions of a heterogeneous-agent model.
[Figure 4 appears here: two panels, titled Survey of Income and Program Participation and Model, plotting Reservation Wages against Assets, with separate lines for males and females aged 21-50 and 51-65.]

Figure 4: Reservation wages as a function of assets for two age groups (21-50 and 51-65) and for both gender groups. Left Panel. Evidence from the Survey of Income and Program Participation. Right Panel. Model generated reservation wages.

Information about reservation wages is available in the topical module of Wave 5 for 1984. The SIPP sample design consists of 21,000 household units. Each household was interviewed at four-month intervals and was asked questions about the four months before the interview day. The data offer information on household residents like education, age, gender, race, marital status, etc., as well as employment history for the past four months. Individuals who experienced at least one spell of unemployment in between the interviews are also asked about the minimum wage they would be willing to work for. In addition, the SIPP makes available information on the total net worth of the household assets.15 While empirical evidence is very likely to suffer from measurement error, it would be informative to draw some comparisons between reservation wages in the model and in the data. In Figure 4 I plot linear fits between assets and reservation wages in the data and in the model (all quantities are normalized to their means). The regressions are plotted for two age groups (21-50 and 51-65) and separately for males and females. Three patterns stand out looking at Figure 4.

15 To be precise, information about assets is included in the topical module of Wave 4. Alexopoulos and Gladden (2002) state that collective evidence supports that wealth information from the SIPP is comparable to the wealth information from the PSID.
1) In the SIPP the correlation between assets and reservation wages is positive, a pattern also confirmed in our model.16

2) Time horizon also matters for reservation wages: The reservation wages of older individuals are higher and more responsive to asset holdings than those of younger individuals. Wealth-poor younger females are an exception as they seem to ask for higher wages than wealth-poor older females. The model is, in general, consistent with these predictions.

3) In sharp contrast with the model's predictions, in the SIPP females have lower reservation wages than males. To understand this counter-intuitive result, we have to note that most of the respondents in the SIPP report a reservation wage very close to (usually slightly lower than) their wage at their last job. The correlation between reservation wages and wage at the last job is 0.739. So if females receive on average lower wages they will also report lower reservation values. Thus, a more meaningful comparison might be to compare reservation wages relative to wages at the last job between males and females. Indeed, females ask on average 86% of their last wage while males ask 82% of their last wage.

So, in general, our model is consistent with the evidence on reservation wages documented in the SIPP, especially regarding the effects of assets and time horizon.

5 Optimal Tax System

This section sets up the main quantitative experiment. In particular, I construct a tax system that uses information on the household's earnings as well as other characteristics like age, financial assets, and the household's composition. The target is to raise the same amount of revenue with the least amount of distortions. I state the problem in terms of an optimal Ramsey problem and discuss the results.

Social welfare function. The government's objective is to maximize the ex ante expected lifetime utility of the newborn household at the new steady state.
This way the government takes into account both the need for efficiency and insurance. Formally the welfare function can be written as

$$ SWF = \int V_{z1}(a, x, S_{-1}) \, \Phi_{z1}(a, x, S_{-1}) \qquad (32) $$

where $x$ is equal to the mean productivity and $S_{-1}$ is $\{u, u\}$ since both members have to look for a job when they start their lives. The integral is taken over possible types $z$.17

16 Note that individuals with higher reservation wages do not necessarily have a larger labor supply elasticity. What matters for the labor supply elasticity is the distance between the offered wage and the reservation wage.

Ramsey Problem

The Ramsey problem is that of maximizing the social welfare function with respect to a given set of policy instruments $\pi$. The allocations have to respect the government budget constraint and constitute a competitive equilibrium. The problem is written as follows:

$$ \max_{\pi} \; SWF(\pi) \quad \text{s.t.} \quad G = \tau_c C(\pi) + \tau_k r K(\pi) + \sum_{i = S_N, J_N} \int T_L^i(\pi). \qquad (33) $$

In the benchmark economy the available policy instruments $\pi$ are given by the following equations for single and joint filers

$$ T_L(W, FS) = \begin{cases} T_L^S(W) = W - (1 - \tau_0) W^{1 - \tau_1} \\ T_L^J(W) = W - (1 - \tau_0) W^{1 - \tau_2} \end{cases} $$

where $W = \hat{w}_m h_m + \hat{w}_f h_f$ represents the household’s total labor earnings and FS stands for filing status. The main idea of the paper is to examine the potential of a tax system that jointly uses information on age, assets, and filing status. To do so I consider a new set of policy instruments that use the following equations

$$ T_L(W, a, j, FS) = \begin{cases} T_L^S(W, j, a) = W - (1 - \tau_0^S(j, a)) W^{1 - \tau_1} \\ T_L^J(W, j, a) = W - (1 - \tau_0^J(j, a)) W^{1 - \tau_2} \end{cases} $$

Here, $a$ represents asset holdings and $j$ the age of household members. The new system differentiates tax rates across households of different age and asset holdings for both single and joint filers. This takes place through the parameter $\tau_0$, for which the parametrization (for example for single filers) takes the form

$$ \tau_0^S(j, a) = \tau_{00} + \tau_{01} a + \tau_{02} j + \tau_{03} j^2 + (\tau_{04} + \tau_{05} a) j^3. $$

A similar form is assumed for $\tau_0^J$.
This functional form is designed to capture the differences in elasticities across age and wealth groups for both single and joint filers.18

The problem is solved in two stages. For a given set of tax instruments, I calculate the competitive equilibrium and make sure that the government budget constraint is satisfied. I then search over the candidate tax parameters to find the ones that maximize the social welfare function.

17 Our social welfare function corresponds to a Utilitarian view of tax policy. However, as Weinzierl (2014) warns, this objective might not represent the true preferences of the society. I choose to employ this very common welfare criterion as a useful first step in understanding the optimal properties of my tax function and also to facilitate the comparison with similar papers in the literature.

18 Although this tax function allows for a very large degree of flexibility, it is only one of the possible functional forms one could have considered. My choice is guided by the relationship between labor supply elasticity and age-asset holdings found in Section 4. For example, this specification can capture well the nonlinear age-profile of labor supply elasticity. At the same time, since assets are an important determinant of labor supply elasticity I added an interaction term in wealth. This means that different wealth groups will face a different life-cycle profile of taxes. I found that expanding the polynomial or adding more interaction terms with respect to wealth did not deliver any additional welfare gains.
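As a rough illustration of the tax instruments defined above, the following Python sketch evaluates the labor income tax $T_L(W) = W - (1 - \tau_0) W^{1 - \tau_1}$ together with the polynomial for $\tau_0(j, a)$, and derives the implied average and marginal rates. It is a minimal sketch and not the paper's code: the function names are made up for this example, earnings and assets are assumed to be in normalized model units, and the polynomial coefficients are hypothetical values chosen only to give a hump-shaped age profile. The only number taken from the paper is the progressivity estimate $\tau_1 = 0.073$ reported in Appendix B.

```python
"""Sketch of the tax instruments; all polynomial coefficients are hypothetical."""


def tau0_level(age, assets, coef):
    """tau_0(j, a) = t00 + t01*a + t02*j + t03*j**2 + (t04 + t05*a)*j**3."""
    t00, t01, t02, t03, t04, t05 = coef
    return (t00 + t01 * assets + t02 * age + t03 * age**2
            + (t04 + t05 * assets) * age**3)


def labor_tax(earnings, tau0, tau1):
    """T_L(W) = W - (1 - tau0) * W**(1 - tau1), for earnings W > 0."""
    return earnings - (1.0 - tau0) * earnings ** (1.0 - tau1)


def average_rate(earnings, tau0, tau1):
    """Average tax rate T_L(W) / W = 1 - (1 - tau0) * W**(-tau1)."""
    return labor_tax(earnings, tau0, tau1) / earnings


def marginal_rate(earnings, tau0, tau1):
    """Marginal tax rate dT_L/dW = 1 - (1 - tau0) * (1 - tau1) * W**(-tau1)."""
    return 1.0 - (1.0 - tau0) * (1.0 - tau1) * earnings ** (-tau1)


# Progressivity estimate tau_1 reported in Appendix B of the paper.
TAU1_SINGLE = 0.073

# HYPOTHETICAL coefficients (not the paper's estimates): quadratic in age only,
# picked so that tau_0 is higher in middle age than at ages 25 and 65.
EXAMPLE_COEF = (-0.5, 0.0, 0.034, -0.00038, 0.0, 0.0)

if __name__ == "__main__":
    for age in (25, 45, 65):
        level = tau0_level(age, assets=1.0, coef=EXAMPLE_COEF)
        rate = average_rate(1.0, level, TAU1_SINGLE)
        print(f"age {age}: tau0 = {level:.4f}, average rate at W = 1: {rate:.4f}")
```

In this functional form the gap between the marginal and the average rate equals $\tau_1 (1 - \tau_0) W^{-\tau_1}$, so the schedule is progressive whenever $\tau_1 > 0$, while $\tau_0$ shifts the level of taxes. This is why tagging age and assets through $\tau_0$ alone is enough to move average rates across groups.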
[Figure 5 omitted: three panels plotting average tax rates (vertical axis) against total household earnings in $1,000 (horizontal axis). Panel titles and series: Average Tax Rates and Age (Benchmark, Age 25, Age 45, Age 65); Average Tax Rates and Assets (Benchmark, $40,000, $500,000); Average Tax Rates and Filing Status (Benchmark, Single Filing, Joint Filing).]

Figure 5: Benchmark and optimal tax system. Left Panel. Average tax rates across age groups. Middle Panel. Average taxes across households with different assets. Right Panel. Average tax rates for households who file a single tax return and a joint tax return.

Properties of the Optimal Tax Function

This part describes how the tax code should vary with personal characteristics. To make things simpler I describe how the tax rates change between the benchmark and the optimal economy for a household whose total earnings are equal to the mean household earnings in our data ($82,204). The optimal tax code has the following three properties.

1) The tax burden decreases for younger and older households while it increases for middle-aged households. This can be seen in the left panel of Figure 5. The analysis is for a household that files a joint tax return and has assets equal to the mean assets in the economy. In the benchmark economy this household will pay 23.0% of its income in taxes. In the optimal economy a household of age 25 will pay 12.5% of its income in taxes, a household of age 45 will pay 29.5% of its income in taxes, while a household of age 65 will pay just 9.1% of its income.

2) Households closer to retirement receive an additional tax cut if they have accumulated a significant amount of assets. The analysis is for a household of age 65 that files a joint tax return and can be seen in the middle panel of Figure 5. As before, in the benchmark economy a household with mean earnings pays 23.0% of its income in taxes.
In the optimal economy a household with $40,000 in assets will pay 19.4% in taxes, but if this household has $500,000 in assets, it will pay only 7.2% of its income. Note that on average older households pay less, but this decrease is not uniformly distributed.

3) Two-earner households face a lower tax rate than single-earner households with the same income. The right panel of Figure 5 compares average tax rates for a household of age 35 with assets equal to the mean assets in the economy. Again, under the benchmark system a household that files jointly pays 23.0% of its income in taxes (the benchmark tax schedule for single filers is almost identical so I do not plot it for simplicity). In the optimal economy single filers pay 33.1% of their income in taxes while joint filers pay 21.5%. Notice that the middle-aged groups pay on average more, but households with two working members actually face a small tax cut.

Table 4: Aggregate Effects of Policy

Variable                  TL(W,j)   TL(W,a)   TL(W,FS)   TL(W,j,FS)   TL(W,a,j,FS)
Capital                   -1.74%    +2.84%    +2.82%     +1.30%       +8.37%
Labor                     +0.65%    +0.85%    +1.74%     +2.75%       +3.17%
Consumption               +0.54%    +0.34%    +2.71%     +3.77%       +4.88%
Output                    -0.21%    +0.83%    +2.13%     +2.46%       +5.02%
Wage Rate                 -0.80%    +1.01%    +0.35%     -0.15%       +1.75%
Interest Rate             +0.18%    -0.14%    -0.07%     +0.04%       -0.37%
Consumption Equivalent    +0.46%    +0.05%    +0.48%     +0.59%       +0.90%

Aggregate Effects of the Reform

The proposed reform is associated with large gains. Table 4 reports the percentage change in key macro aggregates between the benchmark and the optimal economy (the two economies raise the same revenue). Total labor supply as measured in efficiency units increases by 3.17%. Capital also increases by a large 8.37%. This leads to an increase in total output of 5.02%. Aggregate consumption also increases by a large amount, namely 4.88%. Even though labor supply increases, the wage rate increases by 1.75%. This happens due to the larger capital stock which makes workers more productive.
As a result, the interest rate will be lower by 0.37 percentage points in the new economy. To measure the welfare gains we compute the uniform percentage change in consumption, at each date and in each state, needed to make a newborn indifferent between the benchmark and the optimal economy, provided that labor effort is the same. If the consumption equivalent is positive, the new economy is preferable since the agent would have to be compensated in order to accept being born in the initial economy. At the new steady state welfare increases by a sizeable 0.90% of annual consumption.19 This number is even more significant if one considers that in the new steady state individuals spend 10% more of their time working.

19 The welfare gains are computed using the formula $CEV = e^{\frac{1 - \beta}{1 - \beta^{J+1}} (W_2 - W_1)} - 1$, where $W_1$ and $W_2$ are the ex ante welfare of the newborn at the old and the new steady state, respectively.

[Figure 6 omitted: four panels comparing the benchmark and the optimal economy by age. Panel titles: Assets ($1,000), Consumption ($1,000), Labor Income Taxes ($1,000), Total Household Hours.]

Figure 6: Life-cycle profiles: benchmark and optimal economy. Upper Left Panel. Average assets. Upper Right Panel. Average consumption. Lower Left Panel. Average labor income taxes. Lower Right Panel. Total Household Hours.

Life-Cycle Profiles

In Figure 6 we plot life-cycle profiles for average assets, average consumption, average labor income taxes, and total working hours by the household. The lower-right panel shows the large increase in labor supply. This is driven first by the increase in the number of two-earner households in the new economy. In the benchmark economy 61% of households employ both members, 24% employ a single member, and 15% none. In the optimal economy, 83% of households are two-earner families, only 3% are single-earner, while 13% do not employ any of their members.
This dramatic increase in female participation is related to the large labor supply elasticities of this group. The increase in labor supply is also related to individuals (both males and females) delaying their retirement. Participation rates for people between ages 51-65 increase from 75.0% and 55.1% for males and females, respectively, to 76.8% and 76.0% in the new economy. This happens in spite of older households having a larger amount of assets (stronger wealth effects). More importantly, in spite of the heavier tax burden, single-earner middle-aged households do not decrease their labor supply significantly. These results are in line with our findings that middle-aged males feature very low participation elasticities while older households feature very high ones.

The upper-left and the upper-right panels show the increase in assets and consumption, respectively. The increase in assets occurs for three reasons. First, workers delay their retirement and continue to build up their life-cycle savings up to age 65. Second, the optimal tax system decreases tax rates for relatively wealthier households who are close to retirement. This way, the tax code encourages households to keep saving during middle ages to receive the tax credit later. Third, households earn more due to the large increase in female participation and thus can afford to save more for retirement. Higher earnings, a higher wage per hour, and larger asset holdings lead to a large increase in consumption. Consumption also increases for retirees. This is because agents enter retirement having on average a much larger stock of savings. At the same time higher labor supply implies a higher Social Security benefit.

Decomposition of Efficiency and Welfare Gains

To understand better how the policy works, we need to identify the contribution of each variable (age, assets, and filing status) to the total welfare gains.
To this end, I examine the potential of a tax system that can depend separately on these characteristics or jointly only on age and filing status. This exercise also highlights how much we could gain if we used simpler policies.20

20 The qualitative properties of the optimal tax system in each case are similar to the properties outlined so far. I do not repeat them for simplicity.

In the case of age-dependent taxation $\pi = T_L(W, j)$, labor supply increases by 0.65% (Table 4). This is driven by younger and older households increasing their participation by a significant amount. However, the young have less incentive to save in anticipation of lower tax rates closer to retirement. As a result, capital decreases by 1.74%. Output, consumption, and the wage rate also decrease. Overall, welfare increases by 0.46%. A wealth-dependent policy $T_L(W, a)$ can increase capital by 2.84% and labor supply by 0.85%. However, the welfare gains are minor, just 0.05%. Newborn households do not favor a system that places lower taxes on wealth-rich taxpayers, independently of their age.

In the case of a tax system that uses information on filing status $\pi = T_L(W, FS)$, welfare increases by 0.48%. In this scenario, labor supply increases by 1.74%, once again reflecting the tax incentives for females to participate in the labor market. Households use part of their higher earnings to save for retirement. As a result, capital also increases, by 2.82%. The wage rate increases by 0.35% while consumption increases by 2.77%. Making the system age- and household-dependent, $T_L(W, j, FS)$, increases labor supply by 2.75% and capital by 1.30%. Welfare increases by 0.59% compared to the benchmark economy.

Although asset holdings do not add any substantial welfare gains on their own, they can promote welfare significantly if they are part of a system that also uses information on age and filing status, $T_L(W, a, j, FS)$.
In this case, wealth-poor households do not pay larger labor income taxes; they just lose the tax cuts promised to people closer to retirement. This policy seems preferable to a policy that increases labor income taxes for wealth-poor households throughout the life cycle. Moreover, since older households with a high amount of assets pay lower taxes, households are encouraged to keep saving until they reach retirement. This way, wealth-dependent policies can also correct the savings distortions created by age-dependent taxation. As a result, in our optimal economy, capital increases by a large 8.37%, which consequently increases the wage by 1.75%. Hence, a large fraction of the welfare gains is linked to the way the tax-tags interact within the optimal policy.

Comparison with Guner, Kaygusuz, and Ventura (2012b)

In an interesting paper Guner, Kaygusuz, and Ventura examine the potential of an explicit gender-based policy that places a heavier tax burden on males. Their main finding is that a gender-based tax cannot do better than a gender-neutral proportional tax. Although this paper also considers ways to encourage female labor force participation, I do not resort to explicit gender-based policies. In particular, I recommend tax cuts for both household members in dual-earner households. To evaluate better the difference between the two policies I implement a gender-based policy in the spirit of Guner, Kaygusuz, and Ventura (2012b): primary earners face a tax schedule $T_L(\hat{w}_m h_m) = \hat{w}_m h_m - (1 - \tau_0^m)(\hat{w}_m h_m)^{1 - \tau_i}$ while secondary earners face a schedule $T_L(\hat{w}_f h_f) = \hat{w}_f h_f - (1 - \tau_0^f)(\hat{w}_f h_f)^{1 - \tau_1}$.21 To facilitate the comparison I set the tax rate for females equal to the tax rate found at the optimal filing policy $T_L(W, FS)$ and adjust the tax rate for males to balance the government budget. I find the two policies to have different implications.
Tagging filing status increases labor supply by 1.74% and capital by 2.82%, while tagging gender increases labor supply by 0.97% and decreases capital by a small 0.43%. The former policy incentivizes females to work more not only because they pay lower taxes but also because their husbands will pay lower taxes. This encourages even low productivity females to join the workforce. Larger earnings allow households to save more for retirement. In contrast, the gender-based policy increases taxes on the primary earner, which lowers the efficiency gains. More remarkably, the welfare gains in the two cases are strikingly different. Tagging filing status increases welfare by 0.48%, while using a similar gender-based policy decreases welfare by a large 1.47%. This can be partially explained by the wage being lower in the gender-based policy.

21 The parameter $\tau_i$ takes different values depending on whether the female is working or not.

[Figure 7 omitted: labor income taxes (vertical axis) plotted against household income (horizontal axis) for two series, Current US and Optimal.]

Figure 7: Approximated labor income taxes paid as a function of Household’s Income. Benchmark economy and Optimal Economy.

A case for a less progressive tax system?

It is of interest to understand what the optimal tax code would look like if we had used tax instruments that are currently part of the tax system, like a nonlinear labor income tax schedule. To do so I regress labor income taxes paid in the optimal economy on household income and household income squared, also measured in the optimal economy. I repeat this exercise for the benchmark economy and plot the results in Figure 7. A function that distorts high-income households less seems to be the best approximation of our optimal tax system. The purpose of this calculation is to highlight the critical role of labor supply elasticity in the heated debate over the progressivity of the income tax schedule.
Smaller distortions at the top might be optimal because a) workers will retire at a later date, b) high-income households are much more sensitive to wage fluctuations since they can always use their assets to self-insure, and c) they can encourage households to employ both members.22

22 Conesa and Krueger (2006) build a life-cycle model to investigate the optimal progressivity of the income tax schedule. They find that the optimal tax code is well approximated by a proportional income tax with a fixed deduction.

6 Extensions

In this section I consider different versions of the model that incorporate i) a constant elasticity of labor supply, and ii) endogenous human capital accumulation. For both exercises I calculate the change in the aggregates as well as the welfare gains from changing the tax code to the optimal tax system found in our benchmark model.23 Our findings highlight the crucial role of heterogeneity in labor supply elasticity in generating welfare gains. In contrast, I find that omitting endogenous human capital from the analysis has minor implications.

23 Re-calculating the optimal tax system would complicate the analysis by a very large degree, especially in the model that incorporates human capital accumulation. Hence, as a useful first step to understand the importance of each element, I choose to compare gains between the benchmark model and the new economies under the optimal tax code as found in our benchmark specification.

Table 5: Different Model Specifications

Variable                  Benchmark   Constant elasticity   Human Capital
Capital                   +8.37%      -1.85%                +8.01%
Labor                     +3.17%      +0.75%                +3.50%
Consumption               +4.88%      -0.20%                +4.91%
Output                    +5.02%      -0.17%                +5.10%
Wage Rate                 +1.75%      -0.81%                +1.52%
Interest Rate             -0.37%      +0.17%                -0.32%
Consumption Equivalent    +0.90%      -0.80%                +0.98%

Constant elasticity of labor supply

Given the complicated nature of our tax instruments one may wonder if heterogeneity in labor supply elasticity is the main driver of our welfare gains.
To explore this issue I use a model with divisible labor and a Frisch utility function

$$ U = \log c_j + \psi \frac{h_j^{1 + \frac{1}{\gamma}}}{1 + \frac{1}{\gamma}}. \qquad (34) $$

In this economy labor supply elasticity is the same across agents and is given by the parameter $\gamma$. I use a value of 1.41, equal to the average value of labor supply elasticity found in our benchmark economy. To check whether heterogeneity matters I calculate how much we can gain in the constant elasticity model (CEM) by changing the tax code to the optimal tax system found in our benchmark model. If heterogeneity does not matter we should expect to find welfare and efficiency gains of the same magnitude.

It turns out that the exact same policy in the constant elasticity model decreases welfare by 0.80% (Table 5). In this case labor supply increases only by 0.75%, while capital and consumption decrease by 1.85% and 0.20%, respectively. For example, in the CEM average working hours increase by 0.15% for people between 21-35 (due to tax cuts), decrease by 1.20% for people between 36-50 (due to higher taxes for this group), and increase by 1.30% for people between 51-65 (again due to tax cuts). In the benchmark model with heterogeneous elasticities these numbers are +6.70%, +2.41% and +6.11%, respectively. Notice that in the CEM the increase in working hours for younger and older workers is almost matched by the decrease in the working hours of the middle-aged groups who face the tax increases. In contrast, in the heterogeneous elasticity model the tax cuts generate much larger efficiency gains as they are targeted toward groups with a larger labor supply elasticity.

Human Capital Accumulation

We extend our basic model to incorporate endogenous human capital accumulation. To simplify the analysis we introduce human capital only for the secondary earner. Since the participation rates are very high for primary earners we expect human capital to have small effects for this group.
In the benchmark model age-dependent productivity $\log f_j$ evolves exogenously. Here, I assume that $\log f_j = \chi_0 \log(\chi_1 + \kappa_j)$ with $\kappa_j = (1 - \delta_h)\kappa_{j-1} + I\{\text{employed at } j - 1\}$, where $I$ is an indicator function that takes the value of 1 if the worker was employed in the previous period. In this case workers take into account that staying employed can increase their wage next year. I set $\delta_h = 0.11$ based on Blundell, Costa-Dias, Meghir, and Shaw (2013) and calibrate $\chi_0, \chi_1$ to match as closely as possible the position and slope, respectively, of the age-profile of wages estimated from the PSID. Although the reform raises welfare slightly more than in our benchmark (0.98% versus 0.90% in Table 5), the difference does not seem significant. The difference is related to middle-aged households having higher wages as a response to their decision to enter the labor market at an earlier stage. One reason for the small difference is the presence of search costs. Search costs induce a utility loss to unemployed individuals if they decide to return to the labor market. This is similar to the (monetary) loss individuals face due to human capital depreciation. Hence, since the benchmark model already captures some of the frictions related to moving between employment and unemployment (through the presence of the search cost), adding endogenous human capital has small effects on the final results.

7 Conclusion

This paper evaluates the quantitative potential of a tax system that depends on a rich set of household characteristics, such as the person’s age, his/her financial assets, and the number of working members in his/her household. The justification for this kind of reform is that workers respond differently to wage changes depending on how close they are to retirement, how wealthy they are, and whether they are the main financial provider in the family.
I find that middle-aged households are much more likely to stay employed in the face of a tax increase compared to younger households and households closer to retirement. At the same time, a worker in a single-earner household is not as sensitive to tax increases as a worker who is the secondary earner in a family. The optimal system increases taxes for middle-aged households with only a single earner, while it decreases tax rates for younger and older households and especially those with two working members. The gains from the reform turn out to be large. Labor supply increases by 3.17%, capital by 8.37%, and consumption by 4.88%. Welfare increases by 0.90% in terms of consumption equivalent variation. A decomposition shows that the interaction between policy variables is a crucial determinant of the overall gains. Approximating the optimal tax system by a standard nonlinear tax function, I find that smaller distortions for high-income individuals give a closer approximation to the optimal tax system compared to the current U.S. system.

References

Albanesi, S., and Sleet, C. (2006). “Dynamic optimal taxation with private information”. Review of Economic Studies, 47 (1), 1-27.
Alexopoulos, M., and Gladden, T. (2002). “Wealth, reservation wages and labor market transitions in the U.S.: Evidence from the SIPP”. Working Paper, University of Toronto, 47 (1), 1-27.
Attanasio, O., Low, H., and Sanchez-Marcos, V. (2008). “Explaining changes in female labor supply in a life cycle model”. American Economic Review, 98 (4), 1517-1552.
Blau, F. D., and Kahn, J. (2000). “Gender differences in pay”. The Journal of Economic Perspectives, 14 (4), 75-99.
Blundell, R., Costa-Dias, M., Meghir, C., and Shaw, J. (2013). “Female labour supply, human capital and welfare reform”. NBER Working Paper, No. 19007.
Chang, Y., and Kim, S. (2006). “From individual to aggregate labor supply: a quantitative analysis based on a heterogeneous agent macroeconomy”.
International Economic Review, 47 (1), 1-27.
Conesa, J. C., Kitao, S., and Krueger, D. (2009). “Taxing capital? Not a bad idea after all!”. American Economic Review, 99 (1), 25-48.
Conesa, J. C., and Krueger, D. (2006). “On the optimal progressivity of the income tax code”. Journal of Monetary Economics, 53 (7), 1425-1450.
Domeij, D., and Floden, M. (2006). “The labor supply elasticity and borrowing constraints: Why estimates are biased”. Review of Economic Dynamics, 9 (2), 242-262.
Erosa, A., Fuster, L., and Kambourov, G. (2013). “Towards a micro-founded theory of aggregate labor supply”. Working Paper, University of Toronto.
Erosa, A., and Gervais, M. (2002). “Optimal taxation in life cycle economies”. Journal of Economic Theory, 105 (2), 338-369.
Farhi, E., and Werning, I. (2013). “Insurance and taxation over the life cycle”. Review of Economic Studies, 80 (2), 596-635.
French, E. (2005). “The effects of health, wealth, and wages on labor supply and retirement behavior”. Review of Economic Studies, 72 (2), 395-427.
Gourio, F., and Noual, P. (2009). “The marginal worker and the aggregate elasticity of labor supply”. Working Paper, Boston University.
Guner, N., Kaygusuz, R., and Ventura, G. (2012a). “Taxation and household labour supply”. Review of Economic Studies, 79 (3), 987-1020.
Guner, N., Kaygusuz, R., and Ventura, G. (2012b). “Taxing women: A macroeconomic analysis”. Journal of Monetary Economics, 59 (1), 111-128.
Hansen, G. (1985). “Indivisible labor and the business cycle”. Journal of Monetary Economics, 16 (3), 309-327.
Heathcote, J., Storesletten, K., and Violante, G. (2014). “Optimal tax progressivity: An analytical framework”. NBER Working Paper, No. 19899.
Imrohoroglu, S., and Kitao, S. (2012). “Social security reforms, benefit claiming, labor force participation and long run sustainability”. American Economic Journal: Macroeconomics, 4 (3), 96-127.
Jaimovich, N., and Siu, H. E. (2009).
“The young, the old, and the restless: Demographics and business cycle volatility”. American Economic Review, 99 (3), 804-826.
Keane, M. P. (2011). “Labor supply and taxes: A survey”. Journal of Economic Literature, 49 (4), 961-1075.
Kimmel, J., and Kniesner, T. (1998). “New evidence on labor supply: Employment versus hours elasticities by sex and marital status”. Journal of Monetary Economics, 42 (2), 289-301.
Kitao, S. (2010). “Labor-dependent capital income taxation”. Journal of Monetary Economics, 57 (8), 959-974.
Kocherlakota, N. (2005). “Zero expected wealth taxes: A Mirrlees approach to dynamic optimal taxation”. Econometrica, 73 (5), 1587-1621.
Low, W. H. (2005). “Self insurance in a life-cycle model of labour supply and savings”. Review of Economic Dynamics, 8 (4), 945-975.
MaCurdy, T. (1981). “An empirical model of labor supply in a life cycle setting”. Journal of Political Economy, 89 (6), 1059-1085.
Pencavel, J. (1998). “Assortative matching by schooling and the work behavior of wives and husbands”. American Economic Review, 88 (2), 326-329.
Pijoan-Mas, J. (2006). “Precautionary savings or working longer hours?”. Review of Economic Dynamics, 9 (2), 326-352.
Pistaferri, L. (2003). “Anticipated and unanticipated wage changes, wage risk, and intertemporal labor supply”. Journal of Labor Economics, 21 (3), 729-754.
Prescott, E., Rogerson, R., and Wallenius, J. (2009). “Lifetime aggregate labor supply with endogenous workweek length”. Review of Economic Dynamics, 12 (1), 23-36.
Rogerson, R. (1988). “Indivisible labor, lotteries and equilibrium”. Journal of Monetary Economics, 21 (1), 3-16.
Rogerson, R., and Wallenius, J. (2009). “Micro and macro elasticities in a life cycle model with taxes”. Journal of Economic Theory, 144 (6), 2277-2292.
Storesletten, K., Telmer, I. C., and Yaron, A. (2004). “Consumption and risk sharing over the life cycle”. Journal of Monetary Economics, 51 (3), 609-633.
Tauchen, G. (1986).
“Finite state Markov-chain approximations to univariate and vector autoregressions”. Economics Letters, 20, 177-181.
Weinzierl, M. (2011). “The surprising power of age-dependent taxes”. Review of Economic Studies, 78 (4), 1490-1518.
Weinzierl, M. (2014). “The promise of positive optimal taxation”. Forthcoming, Journal of Public Economics.

Appendix

A. PSID and Data Restrictions

I use data from a wide range of PSID waves, from 1970 to 2005. The survey was conducted annually up to 1997 and biennially from 1999 to 2005. For each year data are collected for both the head of the household and the “wife” of the household. These are the total amount of hours supplied, their annual labor income, and their sex. For hours I use the variables “Head Annual Hours of Work” and “Wife Annual Hours of Work”. These variables represent the total annual work hours on all jobs including overtime. For labor income I use the variables “Head Wage” and “Wife Wage”, which include wages and salaries. Households with a single female primary earner are excluded from the analysis. The measure of wealth is the variable WEALTH2 as found in specific waves of the PSID. This variable is constructed as the sum of the values of several asset types (family farm business, family accounts, assets, stocks, houses and other real estate, etc.) net of debt value.

B. CPS and Tax Estimates

To estimate the progressivity of the US tax schedule I use data from the CPS for the period 1992-2005. In particular I gather information on the individual’s annual working hours (usual weekly hours × weeks worked), family income, marital status, type of filing (nonfiler, single, joint) and marginal tax rates. The sample is restricted to people who work between 800 and 5200 hours, who report positive family income and who are between the ages of 21 and 70. I estimate two separate equations, one for married individuals who file jointly and one for single filers independently of whether they are married or not.
To do so I use the following regression:

$$ \log(1 - \text{marginal tax rate}) = \beta_0 + \beta_1 \log(\text{labor earnings}) $$

To see how this regression is derived, denote $le = \hat{w}_m h_m + \hat{w}_f h_f$ as the household’s total labor earnings and note that the tax function is given by $T_L(le) = le - (1 - \tau_0)(le)^{1 - \tau_1}$. Differentiating we get

$$ \begin{aligned} T_L'(le) &= 1 - (1 - \tau_0)(1 - \tau_1)(le)^{-\tau_1} \\ \Rightarrow \; 1 - T_L'(le) &= (1 - \tau_0)(1 - \tau_1)(le)^{-\tau_1} \\ \Rightarrow \; \log(1 - T_L'(le)) &= \log(1 - \tau_0) + \log(1 - \tau_1) - \tau_1 \log(le) \\ \Rightarrow \; \log(1 - T_L'(le)) &= \beta_0 + \beta_1 \log(le) \end{aligned} $$

Since $\beta_1 = -\tau_1$, regressing the log of one minus the marginal tax rate on log earnings identifies the progressivity parameter $\tau_1$ (and $\tau_2$). As mentioned in the main text the estimates are $\tau_1 = 0.073$ while $\tau_2 = 0.065$.

C. Solution Algorithm (Benchmark)

This is a general equilibrium problem. We are looking for market prices $\{w, r\}$ which clear the markets and transfers $Tr$ that are equal to the total amount of savings by the deceased. To solve this problem we start by guessing prices $w^0, r^0$ and transfers $Tr^0$. The dynamic program is solved by backwards induction.

1. Grid Construction: A grid of 150 points is specified for the assets, making sure that the upper bound is large enough. More grid points are assigned to lower values. The continuous process of the transitory labor income shock $x$ is discretized into a six-state Markov chain using the methodology described by Tauchen (1986). The unconditional variance of the process is equal to $\sigma_x^2 = \frac{\sigma_\eta^2}{1 - \rho^2}$. I set the grid’s bounds to $[-\lambda\sigma_x, \lambda\sigma_x]$ with $\lambda = 1.2 \times \log(6)$ and divide the space into 6 equally spaced points. The corresponding transition matrix is

$$ Q(\eta' \mid \eta) = \begin{pmatrix} 0.910 & 0.088 & 0.000 & 0.000 & 0.000 & 0.000 \\ 0.035 & 0.893 & 0.071 & 0.000 & 0.000 & 0.000 \\ 0.000 & 0.044 & 0.898 & 0.056 & 0.000 & 0.000 \\ 0.000 & 0.000 & 0.056 & 0.898 & 0.044 & 0.000 \\ 0.000 & 0.000 & 0.000 & 0.071 & 0.893 & 0.035 \\ 0.000 & 0.000 & 0.000 & 0.000 & 0.088 & 0.910 \end{pmatrix} $$

The transition process implies an invariant distribution equal to $\Pi^* = [0.066, 0.1675, 0.265, 0.265, 0.167, 0.066]$.
Lastly, I transform the grid into consumption units by taking the exponential, and I normalize using the invariant distribution. The grid used in the simulation is the following: $x = [0.117, 0.236, 0.474, 0.950, 1.905, 3.820]$. The permanent component of labor income, $\log z$, is normally distributed with mean zero and variance $\sigma_z^2$; I divide the space into 4 equally spaced grid points. The grid bounds are equal to three standard deviations, which gives $\log z = [-1.558, -0.519, 0.519, 1.558]$.

2. Guessing prices: The first step is to guess a set of firm inputs $K_d^0, L_d^0$. Using the first-order conditions, these imply a set of prices $\{w, r\}$. We also guess a value for transfers $Tr^0$.

3. Solving for the Retirees: The problem is solved by backward induction. Using that $a'_{81} = 0$ we can easily back out the value function $V_{81}(a)$. To find $V_{80}(a)$ I solve a one-dimensional optimization problem over $a'$. I use golden-section search and spline interpolation to approximate the value function at points off the grid. Using this method we get a series of value functions $\{V_j(a)\}_{j=66}^{81}$ and policy functions $\{g_j^a(a)\}_{j=66}^{81}$.

4. Solving for Workers: The problem for working cohorts requires calculating three different value functions $V_j^{EE}$, $V_j^{EU}$, $V_j^{U}$. To calculate $V_j^{EU}$ we need to optimize over both $a'$ and $h$. I proceed as follows: for every state vector and potential savings choice $a'$, I use bisection to solve the static first-order condition $\psi(1-h)^{-\theta} = \frac{\hat{w}\,(1 - T^{L\prime}(\hat{w}h))}{c\,(1+\tau_c)}$ to get $h(a'; \omega)$. The problem is now reduced to a one-dimensional problem. Finding $g_j^a(\omega)$ allows us to back out $g_j^h(\omega)$. Using both we can find the value $V_j^{EU}(\omega)$. For the value $V^{EE}$ I use the same method and use that $\bar{h} = 0.34$. Lastly, the value for the unemployed $V_j^U$ is easier to obtain since it requires only a one-dimensional optimization problem. Participation is found by comparing the three functions: $V_j = \max\{V_j^{EE}, V_j^{EU}, V_j^{U}\}$.
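The bisection step on the static first-order condition can be illustrated with a minimal Python sketch. Several pieces are assumptions made only for illustration and are not taken from the paper: the function names `foc_residual` and `solve_hours`, the value $\psi = 1$, the test inputs, and above all the simplified budget constraint $c(1+\tau_c) = \text{resources} + (1-\tau_0)(\hat{w}h)^{1-\tau_1} - a'$, where `resources` is a hypothetical stand-in for all non-labor cash on hand. The values $\theta = 2$, $\tau_0 = 0.23$, $\tau_1 = 0.073$, and $\tau_c = 0.05$ are those reported in the parameter tables.

```python
def foc_residual(h, a_next, resources, w_hat, psi, theta, tau0, tau1, tau_c):
    """Left side minus right side of psi*(1-h)^(-theta) = w*(1-T'(w*h)) / (c*(1+tau_c))."""
    le = w_hat * h                                            # labor earnings
    after_tax = (1.0 - tau0) * le ** (1.0 - tau1)             # le - T^L(le)
    keep_rate = (1.0 - tau0) * (1.0 - tau1) * le ** (-tau1)   # 1 - T^L'(le)
    # ASSUMED budget constraint (illustrative, not the paper's full constraint)
    c = (resources + after_tax - a_next) / (1.0 + tau_c)
    if c <= 0.0:
        # Non-positive consumption: marginal utility is unbounded, so more hours are needed.
        return float("-inf")
    return psi * (1.0 - h) ** (-theta) - w_hat * keep_rate / (c * (1.0 + tau_c))


def solve_hours(a_next, resources, w_hat, psi=1.0, theta=2.0,
                tau0=0.23, tau1=0.073, tau_c=0.05, iters=200):
    """Bisection for hours h in (0, 1), given a candidate savings choice a_next."""
    args = (a_next, resources, w_hat, psi, theta, tau0, tau1, tau_c)
    lo, hi = 1e-9, 1.0 - 1e-9
    if foc_residual(lo, *args) > 0.0 or foc_residual(hi, *args) < 0.0:
        raise ValueError("no interior solution for hours on (0, 1)")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if foc_residual(mid, *args) > 0.0:
            hi = mid   # disutility of work too high: root lies at fewer hours
        else:
            lo = mid   # marginal benefit still exceeds cost: root lies at more hours
    return 0.5 * (lo + hi)


# Hypothetical inputs chosen only to exercise the routine.
h_star = solve_hours(a_next=0.5, resources=1.0, w_hat=1.0)
```

Under this assumed budget constraint the residual is increasing in $h$, because the disutility term rises while the after-tax marginal return falls and consumption rises. The root is therefore unique, which is what makes bisection a natural choice for this inner step; the outer search over $a'$ then remains one-dimensional.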
Using this method we get a series of value functions $\{V_j^{EE}(\omega), V_j^{EU}(\omega), V_j^{U}(\omega), V_j(\omega)\}_{j=21}^{65}$ and policy functions $\{g_j^a(\omega), g_j^h(\omega)\}_{j=21}^{65}$.

5. Simulation: At this stage I generate a cross section of 5,000 individuals and track them over their lifetime. Exogenous variables (productivity) evolve based on the Markov process. Endogenous variables are consistent with the decision rules. Aggregating gives $K_s$, $L_s$, and $Tr$.

6. The new guess is found by $K_d^1 = \chi K_d^0 + (1-\chi)K_s$, $L_d^1 = \chi L_d^0 + (1-\chi)L_s$, and $Tr^1 = \chi Tr^0 + (1-\chi)Tr$. To guarantee convergence I set $\chi$ very close to 1. Using the new guesses I go back and solve the problem again. This process stops when all guesses are sufficiently accurate.

D. Female Labor Supply and Cohort Effects

[Figure A1: two panels plotting the participation rate (0 to 1) against age (20 to 70). Left panel title: "Female Participation across Cohorts", with one line per cohort (1920, 1930, 1940, 1950, 1960, 1970). Right panel title: "Female Participation net of Cohort Effects".]

Figure A1: Left Panel. Labor participation rates for females from the PSID across cohorts. Right Panel. Labor participation rates for females net of cohort effects.

Female labor force participation has been steadily increasing over the last decades. In the left panel of Figure A1 I follow different cohorts of females over time and calculate the average participation rate for the specific cohort. So the line that corresponds to 1920 in the figure focuses on females born between 1920 and 1930 and reports the average labor force participation rate for that cohort. Since we have data for the period 1970-2005, we can observe the behavior of this cohort only for ages after 50. Similarly, the line corresponding to 1930 has information on people born between 1930 and 1940. For this cohort we can observe behavior for ages after 30. For cohorts after 1960 we have a similar problem, since we cannot observe behavior after the age of 45-50.
We can see that female labor force participation has been increasing over the past decades. Also, the peak of each profile occurs at an earlier age, meaning that females in recent cohorts enter the labor market sooner rather than later. Finding the average participation rate at each age would require pooling females from different cohorts, which might bias our estimates. To separate the age effects from the cohort effects I use the following strategy. For females born after 1950 I calculate the average participation using all females in the PSID who are younger than 42 (after this age there are too few observations). Looking at the cohorts 1950, 1960, and 1970, the cohort effect seems to diminish, so 1950 seems a suitable threshold year. The results can be seen in the first part of the broken line in the right panel of Figure A1. For age groups 43 and onwards I run an age-cohort dummy regression using cohorts before 1950. I use the cohort effect of the latest cohort (1940) and plot the age effects in the right panel of Figure A1. The profile declines at a fast rate, mimicking the behavior of all cohorts, but starts from a higher point because we have used the cohort effect of cohort 1940. The right panel smooths these two profiles using a third-degree polynomial. This is the profile matched in Figure 3.

E.
Parameter Values

Table 6: Externally set parameters

Parameter | Description | Value | Reference
$J$ | Length of lifetime | 81 | Standard
$j_R$ | Retirement age | 45 | Standard
$n$ | Population growth | 1.1% | US long-run average
$\alpha$ | Technology parameter | 0.36 | Capital share
$\theta$ | Preference parameter | 2 | EFK (2010)
$\tau_{ss}$ | Social security tax | 0.106 | Kitao (2010)
$\tau_c$ | Consumption tax | 0.05 | Imrohoroglu and Kitao (2012)
$\tau_k$ | Capital tax | 0.30 | Imrohoroglu and Kitao (2012)
$\tau_1$ | Labor income tax parameter | 0.073 | CPS
$\tau_2$ | Labor income tax parameter | 0.065 | CPS
{j}^m | Life cycle productivity (primary) | Figure A2 | PSID
{j}^f | Life cycle productivity (secondary) | Figure A2 | PSID
$\{s_j\}$ | Conditional survival probabilities | - | Social security admin. (2005)

Table 7: Parameters Set within the Model

Parameter | Description | Value | Target
$\beta$ | Discount factor | 0.99 | $K/Y = 3.2$
$\delta$ | Depreciation rate | 0.0816 | $I/Y = 0.25$
$fc_1$ | Fixed cost males | 0.04 | Employment$_{21-35}$ = 0.92
$fc_2$ | Fixed cost males | 0.04 | Employment$_{36-50}$ = 0.80
$fc_3$ | Fixed cost males | 0.032 | Employment$_{51-65}$ = 0.75
$\{\gamma_i^m\}_{i=0}^{4}$ | Utility cost males | Figure A3 | Average Hours Profile
$\{\gamma_i^f\}_{i=0}^{4}$ | Utility cost females | Figure A3 | Average Female Participation
$\lambda$ | Probability of separation | 0.025 | $p^m(E \to U) = 0.055$
$\eta_0^m$ | Search cost parameter males | 16.5 | $p^m(U \to E)_{21-42} = 0.44$
$\eta_1^m$ | Search cost parameter males | −0.36 | $p^m(U \to E)_{45-65} = 0.16$
$\eta_0^f$ | Search cost parameter females | 0.08 | $p^f(U \to E)_{21-42} = 0.18$
$\eta_1^f$ | Search cost parameter females | −0.0006 | $p^f(U \to E)_{43-65} = 0.09$
$\tau_0$ | Labor income tax parameter | 0.23 | $G/Y = 0.2$
$\sigma_z^2$ | Variance of permanent shock | 0.27 | Var$(y_{21}) = 0.27$
$\rho$ | Persistence of AR(1) | 0.965 | Linear Slope of profile
$\sigma_\eta^2$ | Variance of AR(1) | 0.045 | Var$(y_{60}) = 0.9$

[Figure A2: two panels plotted against age (20 to 70). Left panel title: "Wage Profile". Right panel title: "Utility Cost". Legend: Primary Earner, Secondary Earner.]

Figure A2: Left Panel. Life-cycle wage profiles for the primary and the secondary earner. Right Panel. Utility cost of working for the primary and the secondary earner.

F.
Selection Effects

The path of average wages for the secondary earner was computed from a sample of working females during our sample period. Since selection into employment is not random, we should not expect this path to reflect the actual wage offered to the average female (true productivity). Therefore, it is informative to check whether our approach generates a discrepancy between observed statistics in the model and in the data. To do so I compare the female-to-male average earnings ratio from the data (PSID) and from the calibrated model for workers who decide to participate. Observed earnings in the PSID reflect the decision to participate based on the true wage offered. Earnings generated in the model reflect the decision to participate based on the potentially biased wage process taken from a selected sample. The difference between the two captures the magnitude of the selection bias. In the data the female-to-male average earnings ratio is equal to 0.558, while in the model the statistic is slightly higher, at 0.573. Hence, our choice of parameters does not seem to generate much discrepancy.

G. Model without age-dependent preference parameters

Our benchmark model matches very well a wide range of statistics for both males and females, such as i) the inverse U-shaped profile of employment, ii) the decreasing probability of moving from unemployment to employment along the life cycle, and iii) the nonmonotonic relationship between labor market participation and asset holdings. One may wonder how well the model would perform if we did not allow for any age-dependent parameters in the calibration. To check this I re-calibrate a “small-scale” version of the model using the following age-independent parameters: $FC$, $\psi^f$, $sc^m$, $sc^f$. Figure A3 plots the data, the model under our benchmark parametrization, and the model under our parsimonious calibration, labeled the “small-scale” model.
In spite of the minimal structure, the model can still capture all the basic features of the labor market. Hence, the age-dependent preference parameters help us refine our results rather than force the model to match the data. This exercise highlights the strength of the endogenous mechanics in the model. It also gives confidence that the model is flexible enough to capture realistically the effects of the policy reforms.

[Figure A3: four panels plotted against age (20 to 70). Panel titles: "Participation Rate (Primary Earner)", "Participation Rate (Secondary Earner)", "Unemployment to Employment (Primary Earner)", "Unemployment to Employment (Secondary Earner)". Legend: PSID, Benchmark, Small Scale.]

Figure A3: Life-cycle profiles: PSID, Benchmark model and Small-Scale model. Upper Left Panel. Participation Rate for Primary Earner. Upper Right Panel. Participation Rate for Secondary Earner. Lower Left Panel. Unemployment to Employment Transition Rate (Primary Earner). Lower Right Panel. Unemployment to Employment Transition Rate (Secondary Earner).