Federal Reserve Bank of Chicago

Risk-Adjusted Capital Allocation and Misallocation∗

Joel M. David† (FRB Chicago), Lukas Schmid‡ (USC Marshall), and David Zeke§ (USC)

December 18, 2020

WP 2020-34
https://doi.org/10.21033/wp-2020-34

* Working papers are not edited, and all opinions and errors are the responsibility of the author(s). The views expressed do not necessarily reflect the views of the Federal Reserve Bank of Chicago or the Federal Reserve System.

Abstract

We develop a theory linking “misallocation,” i.e., dispersion in marginal products of capital (MPK), to macroeconomic risk. Dispersion in MPK depends on (i) heterogeneity in firm-level risk premia and (ii) the price of risk, and thus is countercyclical. We document strong empirical support for these predictions. Stock market-based measures of risk premia imply that risk considerations explain about 30% of observed MPK dispersion among US firms and rationalize a large persistent component in firm-level MPK. Risk-based MPK dispersion, although not prima facie inefficient, lowers long-run aggregate productivity by as much as 6%, suggesting large “productivity costs” of business cycles.
JEL Classifications: D24, D25, E22, E32, G12, O47

Keywords: misallocation, productivity, costs of business cycles, risk premia

∗ We thank Andy Atkeson, David Baqaee, Frederico Belo, Harjoat Bhamra, Vasco Carvalho, Gian Luca Clementi, Wei Cui, Greg Duffee, Andrea Eisfeldt, Emmanuel Farhi, Brent Glover, Francois Gourio, John Haltiwanger, Nir Jaimovich, Şebnem Kalemli-Özcan, Matthias Kehrig, Pete Klenow, Leonid Kogan, Deborah Lucas, Ellen McGrattan, Ben Moll, Stijn Van Nieuwerburgh, Ezra Oberfield, Christian Opp, Stavros Panageas, Dimitris Papanikolaou, Adriano Rampini, Diego Restuccia, John Shea, Jules van Binsbergen, Venky Venkateswaran, Neng Wang, Amir Yaron and many seminar and conference participants for helpful comments and suggestions. The views expressed here are those of the authors and not necessarily those of the Federal Reserve Bank of Chicago or the Federal Reserve System.
† joel.david@chi.frb.org. ‡ lukas@marshall.usc.edu. § zeke@usc.edu.

1 Introduction

A large and growing body of work has documented the “misallocation” of resources across firms, i.e., dispersion in the marginal product of inputs into production, and the resulting adverse effects on aggregate outcomes, such as productivity and output. Recent studies have found that even after accounting for a host of leading candidates – for example, adjustment costs, financial frictions or imperfect information – a substantial portion of observed misallocation seems to stem from other firm-specific factors, specifically, of a type that are orthogonal to firm fundamentals and are extremely persistent (if not permanent) to the firm.1 Identifying exactly what – if any – underlying economic forces lead to this type of distortion has proven puzzling. In this paper, we propose and quantitatively evaluate just such a theory, linking capital misallocation to macroeconomic risks.
To the best of our knowledge, we are the first to make the connection between standard notions of the risk-return tradeoff and the resulting dispersion in the marginal product of capital (MPK). Indeed, our framework provides a natural way to translate firm-level financial market outcomes into the implications for the allocation of capital across firms. Further, we are able to quantify the effects of risk considerations – i.e., dispersion in risk premia and the extent of aggregate volatility (and so aggregate risk) – on macroeconomic variables, such as aggregate total factor productivity (TFP). Through the marginal product dispersion they induce, risk premium effects – though not prima facie inefficient – depress the achieved level of TFP, leading to a previously unexplored “productivity cost” of business cycles in the spirit of Lucas (1987).2 Our point of departure is a standard neoclassical model of firm investment in the face of both aggregate and idiosyncratic uncertainty. Firms discount future payoffs using a stochastic discount factor that is also a function of aggregate conditions. Critically, this setup implies that firms optimally equalize not necessarily MPK, but rather expected, appropriately discounted, MPK. With little more structure than this, the framework gives rise to a sharp condition governing the firm’s expected MPK – firms with higher exposure to aggregate risk require a higher risk premium on investments, which translates into a higher expected MPK. In fact, the model implies an asset pricing equation of exactly the same form that is often used to price the cross-section of stock market returns. The equation simply states that a firm’s expected MPK should be linked to the exposure of its MPK to aggregate risk (i.e., the firm’s “beta”), and the “price” of that risk. This firm-specific risk premium appears exactly as what would otherwise be labeled a persistent distortion or “wedge” in the firm’s investment decision. 
1 See, e.g., David and Venkateswaran (2019). We discuss the literature in more detail below.
2 Our analysis is also reminiscent of the approach in Alvarez and Jermann (2004), who use data on asset prices to measure the welfare costs of aggregate fluctuations.

We use a combination of firm-level production and stock market data to empirically investigate and verify two key implications of this simple framework: (i) firm-level exposure to aggregate risk, measured using standard risk factors priced in financial markets, is an important determinant of expected MPK and (ii) MPK dispersion is increasing in the price of risk, measured using common proxies such as credit spreads and the aggregate price/dividend ratio. Because the price of risk is countercyclical, this link introduces a countercyclical element into MPK dispersion as observed in the data. We use the empirical results to perform a back-of-the-envelope calculation that points to a significant role of risk effects in generating MPK dispersion. Intuitively, the calculation relies on the facts that (i) dispersion in risk premia – readily measured using data on expected stock market returns – is large and (ii) regression estimates yield a sizable elasticity of firm-level MPK to expected returns.

After establishing these empirical results, we interpret them and gauge their magnitudes through the lens of a quantitative model. To that end, we enrich our theory by explicitly linking the sources of uncertainty to idiosyncratic and aggregate productivity risk.3 We add two key elements to this framework: (i) a stochastic discount factor designed to match standard asset pricing facts and (ii) ex-ante cross-sectional heterogeneity in firm exposure, i.e., beta, with respect to the aggregate productivity shock.
The profitability (e.g., productivity or demand) of high beta firms is highly sensitive to the realization of aggregate productivity (which captures the state of the business cycle), low beta firms have low sensitivity, and indeed, the profitability of firms with negative beta may move countercyclically. The investment side of the model is analytically tractable and yields sharp characterizations of firm investment decisions and MPK. This setup is consistent with the key empirical results described above, namely, firm-level expected MPKs depend on exposures to the aggregate productivity shock (the systemic risk factor in the economy) and due to the countercyclical nature of the price of risk, the cross-sectional dispersion in expected MPK is also countercyclical. Further, we derive an expression for aggregate TFP, which is a strictly decreasing function of MPK dispersion. By inducing MPK dispersion, cross-sectional variation in exposure to aggregate risk and a higher price of risk (which depends on the degree of aggregate volatility) reduce the long-run (average) level of achieved TFP. Thus, the model provides a novel, quantifiable link between financial market conditions, i.e., the nature of aggregate risk, and longer-run macroeconomic performance.

The strength of these connections relies on three key parameters – the degree of heterogeneity in firm-level risk exposures and the magnitude and time-series variation in the price of risk. We devise an empirical strategy to identify these parameters using salient moments from firm-level and aggregate stock market data, specifically, (i) the cross-sectional dispersion in expected stock returns, (ii) the market equity premium and (iii) the market Sharpe ratio. We use a linearized version of our model to derive analytical expressions for these moments and show that they are tightly linked to the structural parameters. The latter two pin down the level and volatility of the price of risk and the first identifies the cross-sectional dispersion in firm-level risk exposures. Indeed, in some simple cases of our model, the dispersion in expected MPK coming from risk premium effects is directly proportional to the dispersion in expected stock returns – intuitively, both of these moments are determined by cross-sectional variation in betas.

3 These can also be interpreted as shocks to demand. Later, we show that the environment can be extended to incorporate multiple risk factors and financial shocks.

Before quantitatively evaluating this mechanism, we add other investment frictions to the environment, specifically, capital adjustment costs. Although they do not change the main insights from the simpler model, we uncover an important interaction between these costs and risk premia – namely, adjustment costs amplify the effects of beta variation on MPK dispersion. Intuitively, adjustment costs generate a second source of co-movement of firm outcomes with aggregate conditions – i.e., fluctuations in the value of installed capital – and hence add an additional component to the risk premium. Because the value of capital turns out to be more procyclical for high beta firms, the risk premium is in turn higher for these firms. On their own, adjustment costs have only transitory effects and do not lead to persistent dispersion in firm-level MPK. However, they can augment the effects of other factors that do, such as the heterogeneity in risk premia we analyze here.

We apply our methodology to data on US publicly traded firms from Compustat/CRSP. Our estimates reveal substantial variation in firm-level betas and a sizable price of risk – together, these imply a significant amount of risk-induced MPK dispersion. For example, our results suggest risk premium effects can explain as much as 30% of total observed MPK dispersion.
Importantly, this dispersion is largely due to persistent MPK deviations at the firm-level, exactly of the type that compose a large portion of measured misallocation. Indeed, risk effects can account for as much as 47% of the permanent component in the data. The implications of these findings for the long-run level of aggregate TFP are significant – cross-sectional variation in risk reduces TFP by as much as 6%. Note that this represents a quantitative estimate of the impact of financial market outcomes on macroeconomic performance and further, a new connection between the nature of business cycle volatility and long-run outcomes in the spirit of Lucas (1987). Here, higher aggregate volatility leads to greater aggregate risk, increasing dispersion in required rates of return and MPK and thus reducing TFP. Our results suggest these “productivity costs” of business cycles may be substantial.

Our estimates also imply a significant countercyclical element in expected MPK dispersion. For example, the parameterized model produces a correlation between the cross-sectional variance in expected MPK and the state of the business cycle (measured by the aggregate productivity shock) of -0.31. To put this number in context, the correlation between MPK dispersion and aggregate productivity in the data is about -0.27. This result provides a risk-based explanation for the puzzling observation, made forcefully by Eisfeldt and Rampini (2006), that capital reallocation is procyclical, despite the apparently countercyclical gains – due to the countercyclical nature of the price of risk and high beta of high MPK firms, such reallocation in downturns would require capital to flow to the riskiest of firms in the riskiest of times.

We pursue two important extensions of our baseline analysis. First, we add a flexible class of firm-specific “distortions” of the type that have been emphasized in the misallocation literature.
These distortions can be fixed or time-varying and may be correlated or uncorrelated with firm-level characteristics (including betas) and with the state of the business cycle. We show that to a first-order approximation, our identification strategy and results are either unaffected by these distortions or are likely conservative (depending on the exact correlation structure of the distortions). These findings highlight an important feature of our empirical approach: although observed misallocation may stem from a wide variety of sources, our approach to measuring risk premium effects yields a robust estimate of the contribution of this one source alone. Second, we provide further, direct evidence on the extent of beta dispersion. Rather than relying on stock market data, we compute firm-level betas using production-side data by estimating time-series regressions of firm-level productivity on measures of aggregate productivity. The beta is the coefficient from this regression. This approach yields beta dispersion on par with the dispersion implied by the cross-section of stock market returns.

Why do firms (within an industry) have different exposure to the business cycle? Although our analysis does not require us to take a stand on this question, we explore a number of potential explanations. First, we investigate heterogeneity in production technologies (i.e., input elasticities) and markups. We show that these types of heterogeneity can indeed lead to variation in firm responsiveness/exposure to shocks but at most are likely to account for about 12% and 6% of the estimated standard deviation of betas, respectively. Although non-negligible, these results suggest that the majority of beta dispersion stems from other sources. Next, we show that theories of “trading down” over the business cycle as in, e.g., Jaimovich et al. (2019), may be a promising explanation.
In times of economic expansion, when purchasing power is high, consumers tend to substitute towards higher quality goods while in downturns they substitute towards lower quality ones. Thus, higher quality products are more procyclical and lower quality ones less so. Although systematically quantifying this channel is challenging (such an analysis would require comprehensive product quality data), we provide evidence from a single industry – eating places – where we were able to obtain a proxy for quality, namely, average price. Low-price establishments tend to have lower betas than high-price ones and further, average price is positively related to betas, expected returns and MPK. This case study of a single industry helps illustrate the main relationships implied by our theory and suggests quality differentiation may be an important factor behind differences in firm cyclicality.

Related Literature. Our paper relates to several branches of literature. Foremost is the large body of work investigating resource misallocation, seminal examples of which include Hsieh and Klenow (2009) and Restuccia and Rogerson (2008). A number of recent papers have explored the role of financial frictions, for example, Midrigan and Xu (2014), Moll (2014) and Buera et al. (2011) study collateral constraints and Gilchrist et al. (2013) firm-specific borrowing costs. Gopinath et al. (2017) and Kehrig and Vincent (2017) study the interaction of financial frictions and adjustment costs in explaining recent dynamics of misallocation in Spain and within firms, respectively. We build on this literature by exploring the implications of a different dimension of financial markets for marginal product dispersion, namely, the risk-return tradeoff faced by risk-averse agents. The addition of aggregate risk is a key innovation of our analysis – existing work has typically abstracted from this channel (either by assuming no aggregate uncertainty or risk-neutral agents).
We show that the link between aggregate risk and observed misallocation is quite tight in the presence of heterogeneous exposures to that risk.4 Papers studying additional candidates behind marginal product dispersion include Peters (2016), Edmond et al. (2018) and Haltiwanger et al. (2018) who focus on markup dispersion, David et al. (2016) information frictions and Asker et al. (2014) capital adjustment costs.5 David and Venkateswaran (2019) provide an empirical methodology to disentangle various sources of capital misallocation and establish a large role for highly persistent firm-specific factors. In our theory, firm-level risk premia manifest themselves as persistent firm-specific “wedges” of exactly this type. Kehrig (2015) documents in detail the countercyclical nature of productivity dispersion. We build on this finding by relating fluctuations in MPK dispersion to time-series variation in the price of risk. A growing literature, starting with Eisfeldt and Rampini (2006), investigates the procyclical nature of capital reallocation, which is puzzling since higher cross-sectional dispersion in MPK in downturns should lead capital to flow to highly productive, high MPK firms in recessions. Our results bear on that observation by noting that the countercyclicality of the price of risk, in conjunction with heterogeneity in firm-level risk exposures, goes some way toward reconciling this puzzle.

4 Gilchrist et al. (2013) find a limited role for firm-specific borrowing costs. Where they focus mainly on costs of debt, we find a larger role for differences in costs of equity, which is an important source of financing for the firms in our data (for example, the average leverage ratio in our sample is 0.28). Indeed, in our simple theory, the Modigliani-Miller theorem holds, i.e., all firms are able to borrow at the common risk-free rate and thus there is no dispersion in borrowing costs. Yet equity costs – and so total costs of capital – may differ widely. One contribution of our work is extending the insights in Gilchrist et al. (2013) to a broader notion of financing costs and showing that the implications for misallocation can be quite different.
5 Many papers study the role of firm-specific distortions, e.g., Bartelsman et al. (2013). Restuccia and Rogerson (2017), Hopenhayn (2014) and Eisfeldt and Shi (2018) provide excellent overviews of recent work on capital misallocation/reallocation.

Our work also relates to a large literature exploring the link between financial market returns and the return to capital (“investment returns”). Cochrane (1991), Restoy and Rockinger (1994) and Balvers et al. (2015) show that stock returns and investment returns are closely linked (indeed, exactly coincide under constant returns to scale). Recent work builds on this insight to examine the cross-section of stock returns from the perspective of investment returns, interpreting common risk factors through firms’ investment policies and showing that investment-based factors are priced in the cross-section of returns, e.g., Zhang (2005), Gomes et al. (2006), Liu et al. (2009) and Zhang (2017). We examine investment returns and the marginal product of capital as a joint manifestation of risk premia, most readily measured through stock returns, and extend this connection to analyze the implications for the allocation of capital and macroeconomic outcomes, such as aggregate TFP. Binsbergen and Opp (2017) also investigate the implications of asset market considerations for the real economic decisions of firms. They propose a framework where distortions in agents’ subjective beliefs lead to “alphas,” i.e., cross-sectional mis-pricings, and real efficiency losses, whereas we focus on the MPK dispersion induced by heterogeneity in aggregate risk exposures.
Our empirical work establishes a connection between MPK and financial market outcomes and our quantitative work uses a workhorse macroeconomic model of firm dynamics augmented with risk-sensitive agents and aggregate risk to evaluate the implications of this connection. One of our key messages shares a common theme with this line of work – financial market considerations can have sizable effects on real outcomes by affecting capital allocation decisions.6

6 Relatedly, David et al. (2014) find that risk considerations play an important role in determining the allocation of capital across countries, i.e., can explain some portion of the “Lucas Paradox.”

2 Motivation

In this section, we lay out a simple version of the standard, frictionless neoclassical theory of investment to illustrate the main insight of our analysis, namely, the link between firm-level MPK and risk premia. We use this framework to motivate a number of empirical exercises exploring this connection and guide a simple back-of-the-envelope calculation that suggests a significant role for risk in generating MPK dispersion.7 Section 3 enriches this environment along several dimensions for purposes of our quantitative work.

7 All derivations for this section are in Appendix A.

2.1 MPK and Risk Premia

Firms produce output using capital and labor according to a Cobb-Douglas technology and face constant (or infinite) elasticity demand curves. Labor is chosen period-by-period in a spot market at a competitive wage. At the end of each period, firms choose investment in new capital, which becomes available for production in the following period so that $K_{it+1} = I_{it} + (1-\delta) K_{it}$, where $\delta$ is the rate of depreciation. Let $\Pi_{it} = \Pi_{it}(X_t, Z_{it}, K_{it})$ denote the operating profits of the firm – revenues net of labor costs – where $X_t$ and $Z_{it}$ denote aggregate and idiosyncratic shocks to firm profitability, respectively, and $K_{it}$ the firm’s stock of capital.
Both $X_t$ and $Z_{it}$ may be vectors, i.e., there may be multiple sources of both idiosyncratic and aggregate risk. The analysis can accommodate a number of interpretations of the fundamental shocks, for example, as productivity or demand shifters. With these assumptions, the profit function takes a Cobb-Douglas form, is homogeneous in $K$ of degree $\theta < 1$ (due to curvature in production and/or demand) and is proportional to revenues. The marginal product of capital is equal to $MPK_{it} = \theta \frac{\Pi_{it}}{K_{it}}$. The payout of the firm in period $t$ is equal to $D_{it} = \Pi_{it} - I_{it}$.

Firms discount future cash flows using a stochastic discount factor (SDF), $M_{t+1}$, which is correlated with the aggregate shock(s), $X_t$. We can write the firm’s problem recursively as

$$V(X_t, Z_{it}, K_{it}) = \max_{K_{it+1}} \; \Pi_{it}(X_t, Z_{it}, K_{it}) - K_{it+1} + (1-\delta) K_{it} + E_t\left[M_{t+1} V(X_{t+1}, Z_{it+1}, K_{it+1})\right], \tag{1}$$

where $E_t[\cdot]$ denotes the firm’s conditional expectations. The Euler equation is given by

$$1 = E_t\left[M_{t+1}\left(MPK_{it+1} + 1 - \delta\right)\right] \quad \forall\, i, t. \tag{2}$$

MPK dispersion. An immediate implication of expression (2) is that MPK (or even expected MPK) need not be equated across firms; rather, it is only expected, appropriately discounted MPK that is equalized. To the extent that firms’ MPK co-move differently with the SDF, their expected MPK will differ. From here, we can derive the following equation for expected MPK:

$$E_t[MPK_{it+1}] = \alpha_t + \beta_{it} \lambda_t. \tag{3}$$

Here, $\alpha_t = r_{ft} + \delta$ is the risk-free user cost of capital, where $r_{ft}$ is the (net) risk-free interest rate, $\beta_{it} \equiv -\frac{\mathrm{cov}_t(M_{t+1}, MPK_{it+1})}{\mathrm{var}_t(M_{t+1})}$ captures the elasticity, or exposure, of the firm’s MPK to movements in the SDF – i.e., the riskiness of the firm – and $\lambda_t \equiv \frac{\mathrm{var}_t(M_{t+1})}{E_t[M_{t+1}]}$ is the market price of that risk.
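The step from the Euler equation (2) to expression (3) can be checked numerically. The sketch below is illustrative only: the shock process, the functional form of the SDF and all parameter values are assumptions made for the example, not taken from the paper. It simulates an SDF and an MPK series, shifts MPK by a constant so that the sample Euler equation holds, and then confirms that mean MPK equals $\alpha + \beta\lambda$ when $\beta$ and $\lambda$ are computed from sample moments, using $1 + r_f = 1/E[M]$.

```python
import numpy as np

def expected_mpk_decomposition(m, mpk, delta):
    """Return (E[MPK], alpha, beta, lam) computed from samples of the SDF and MPK."""
    mean_m = m.mean()
    var_m = m.var()                                   # population variance (ddof=0)
    cov_m_mpk = np.mean((m - mean_m) * (mpk - mpk.mean()))
    alpha = 1.0 / mean_m - 1.0 + delta                # r_f + delta, with 1 + r_f = 1/E[M]
    beta = -cov_m_mpk / var_m                         # exposure of MPK to the SDF
    lam = var_m / mean_m                              # price of risk
    return mpk.mean(), alpha, beta, lam

# Hypothetical simulation: every number below is an illustrative assumption.
rng = np.random.default_rng(0)
delta = 0.08
n = 200_000
x = rng.standard_normal(n)                            # aggregate shock
m = 0.96 * np.exp(-0.5 * x - 0.125)                   # SDF: low in good states
raw = 0.10 + 0.05 * x + 0.02 * rng.standard_normal(n) # procyclical MPK before shifting

# Shift MPK by a constant so the sample analogue of the Euler equation (2) holds.
shift = (1.0 - np.mean(m * (raw + 1.0 - delta))) / m.mean()
mpk = raw + shift

e_mpk, alpha, beta, lam = expected_mpk_decomposition(m, mpk, delta)
print(f"E[MPK] = {e_mpk:.6f}, alpha + beta*lambda = {alpha + beta * lam:.6f}")
```

Because the simulated MPK is procyclical while the SDF is countercyclical, the computed beta is positive and mean MPK exceeds the riskless user cost.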
Expression (3) illustrates the first main insight: expected MPK is not necessarily common across firms, but rather is a function of the (common) risk-free rate and a firm-specific risk premium, which depends on the firm’s beta on the SDF – which may vary across firms – and the market price of risk. The cross-sectional variance of date $t$ conditional expected MPK is

$$\sigma^2_{E_t[MPK_{it+1}]} = \sigma^2_{\beta_t} \lambda_t^2, \tag{4}$$

where $\sigma^2_{\beta_t}$ is the cross-sectional variance of time $t$ betas. Expression (4) reveals the second main insight: the extent to which risk considerations lead to dispersion in expected MPK is increasing in the price of risk (and in the cross-sectional variation in risk exposures). A key observation underlying our analysis is that financial market data imply that risk prices are high (e.g., a large equity premium and observed Sharpe ratios on various investment strategies), suggesting a potentially important role for differences in risk exposure in leading to MPK dispersion. Further, given persistence in firm-level betas, the theory implies persistent differences in firm-level MPK – as observed in the data – driven by dispersion in required rates of return.8

Examples. It is useful to consider a few concrete illustrative examples:

Example 1: no aggregate risk (or risk neutrality). In the case of no aggregate risk, we have $\beta_{it} = 0 \;\forall\, i, t$, i.e., all shocks are idiosyncratic to the firm. Expressions (3) and (4) show that there will be no dispersion in expected MPK and for each firm, $E_t[MPK_{it+1}] = r_f + \delta$, which is simply the riskless user cost of capital (which is constant in the absence of aggregate shocks). This is the standard result from the stationary models widely used in the misallocation literature where, without additional frictions, expected MPK should be equalized across firms.9 The same result holds in an environment with aggregate shocks but risk neutral preferences, which implies $M_{t+1}$ is simply a constant (equal to the time discount factor).
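Expression (4) and Example 1 can be illustrated with a few lines of arithmetic. In the sketch below, the betas, the user cost and the price of risk are hypothetical numbers chosen only to show the mechanics: the cross-sectional variance of expected MPK equals the variance of betas times the squared price of risk, and it collapses to zero when all betas are zero.

```python
import numpy as np

def expected_mpk(alpha, betas, lam):
    """Expression (3): expected MPK for each firm given its beta."""
    return alpha + betas * lam

# Hypothetical inputs, for illustration only.
alpha = 0.10                                  # risk-free user cost r_f + delta
lam = 0.06                                    # price of risk
betas = np.array([0.5, 1.0, 1.5, 2.0])        # firm-level risk exposures

# Expression (4): dispersion in expected MPK is var(beta) * lambda^2.
dispersion = np.var(expected_mpk(alpha, betas, lam))
implied = np.var(betas) * lam ** 2

# Example 1: with no aggregate risk exposure, every beta is zero and dispersion vanishes.
dispersion_no_risk = np.var(expected_mpk(alpha, np.zeros_like(betas), lam))

print(dispersion, implied, dispersion_no_risk)
```

Doubling the price of risk in this sketch quadruples the variance of expected MPK, which is the sense in which dispersion is increasing in the price of risk.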
8 To see this more clearly, we can take the unconditional expectation of equation (3) to obtain an approximate expression for the variance of firms’ mean MPKs as $\sigma^2_{E[MPK]} \approx \sigma^2_{\beta} \lambda^2$, where $\sigma^2_{\beta} \equiv \sigma^2_{E[\beta_{it}]}$ denotes the variance of unconditional betas and $\lambda \equiv E[\lambda_t]$ the unconditional expectation of the price of risk. The approximation is valid as long as $\mathrm{cov}(\beta_i, \mathrm{cov}(\beta_{it}, \lambda_t))$ is small. In line with the results in Lewellen and Nagel (2006), we find the time-series variation in betas to be quite modest. Further, they are persistent (for example, we find that CAPM betas have an implied one-year autocorrelation of 0.87). In the case of constant betas or if $\beta_{it}$ is orthogonal to $\lambda_t$, the expression is exact.
9 With time-to-build for capital and uncertainty over upcoming shocks, there may still be dispersion in realized MPK, but not in expected terms, and so these forces do not lead to persistent firm-level MPK deviations.

Example 2: CRRA preferences. In the case of CRRA utility with coefficient of relative risk aversion $\gamma$, standard approximation techniques give

$$E_t[MPK_{it+1}] = \alpha_t + \underbrace{\frac{\mathrm{cov}_t(\Delta c_{t+1}, MPK_{it+1})}{\mathrm{var}_t(\Delta c_{t+1})}}_{\beta_{it}} \; \underbrace{\gamma\, \mathrm{var}_t(\Delta c_{t+1})}_{\lambda_t},$$

where $\Delta c_{t+1}$ denotes log consumption growth. Expected MPK is determined by the covariance of the firm’s MPK with consumption growth. The price of risk is the product of the coefficient of relative risk aversion and the conditional volatility of consumption growth.

Example 3: CAPM. In the CAPM, the SDF is linearly related to the return on the aggregate stock market, i.e., $M_{t+1} = a - b\, r_{mt+1}$ for some constants $a$ and $b$. Because the market portfolio is itself an asset with $\beta = 1$, it is straightforward to derive

$$E_t[MPK_{it+1}] = \alpha_t + \underbrace{\frac{\mathrm{cov}_t(r_{mt+1}, MPK_{it+1})}{\mathrm{var}_t(r_{mt+1})}}_{\beta_{it}} \; \underbrace{E_t[r_{mt+1} - r_{ft}]}_{\lambda_t},$$

i.e., expected MPK is determined by the covariance of the firm’s MPK with the market return, which is the aggregate risk factor in this environment.
The price of risk is equal to the expected excess return on the market portfolio, i.e., the equity premium.

Extensions. As can be seen from this simple framework, the link between firm-level MPK and risk is quite general and does not depend on specific assumptions about the SDF. In Appendix A.2, we provide two additional examples to show that the connection holds under alternative assumptions on the sources of risk premia. Specifically, we study versions where firm-specific risk is due to (i) firm-specific distortions to the pricing of risk (which show up as “alphas” or “mis-pricing”) and (ii) heterogeneity in firm exposures to cyclical fluctuations in the price of investment goods (in the spirit of, e.g., Kogan and Papanikolaou (2013)). In both cases, differences in firm-level risk premia lead to MPK dispersion exactly as above, i.e., an expression analogous to (3) continues to hold. The main message from these exercises is that the link between MPK and risk is quite robust and does not depend on the precise source of differences in risk premia (although the normative implications, e.g., whether the resulting MPK dispersion is efficient or not – and so represents a true “misallocation” – may).

In our quantitative model in Section 3, we add capital adjustment costs, which lead to endogenous fluctuations in the value of installed capital, i.e., Tobin’s Q, and study the role of additional factors (e.g., firm-level distortions) that determine expected MPK as emphasized in the misallocation literature (e.g., Hsieh and Klenow (2009)). In these cases, equations (3) and (4) do not hold exactly, since there are now additional forces generating MPK dispersion, but we develop an empirical strategy that accurately captures the portion of this dispersion that comes from risk effects alone, even in the presence of these other factors.
2.2 Empirical Evidence

In this section, we explore the two key implications of the simple framework laid out thus far: (i) exposure to aggregate risk – and hence, the firm-specific risk premium – is a determinant of expected MPK and (ii) MPK dispersion is increasing in the price of risk.

Measuring risk premia. In addition to influencing firms’ capital choices, exposure to aggregate risk affects firms’ stock returns. Indeed, in our model, expected excess stock returns over the risk-free rate (i.e., the predictable component of the excess return) reflect compensation for risk exposure and thus represent a direct measure of the firm-level risk premium. We can exploit this link to use well-studied measures of firm risk exposures and risk premia from stock market data to explore the connection with MPK. To motivate this approach, we derive the following approximate expressions for firm-level expected excess MPK (over the user cost), denoted $MPK^e_{it+1}$, and excess stock market returns, $R^e_{it+1}$:

$$Empk^e_{it+1} \equiv \log E_t\left[MPK^e_{it+1}\right] = -\mathrm{cov}_t(mpk_{it+1}, m_{t+1}) \tag{5}$$

$$Er^e_{it+1} \equiv \log E_t\left[R^e_{it+1}\right] = -\psi\, \mathrm{cov}_t(mpk_{it+1}, m_{t+1}), \tag{6}$$

where $\psi$ is a constant and lowercase denotes natural logs. To a first-order approximation, expected stock returns and MPK are proportional, since they are jointly determined by the underlying risk characteristics of the firm.10 Combining, we have

$$Empk^e_{it+1} = \frac{1}{\psi} Er^e_{it+1} \;\Rightarrow\; \mathrm{var}\left(Empk^e_{it+1}\right) = \left(\frac{1}{\psi}\right)^2 \mathrm{var}\left(Er^e_{it+1}\right), \tag{7}$$

which shows that (in logs) the cross-sectional dispersion of expected MPK is proportional to the cross-sectional dispersion of expected stock market returns. We can use expression (7) to (i) verify the link between firm-level MPK and risk premia, measured from expected stock returns and (ii) calculate a back-of-the-envelope estimate of the MPK dispersion that may stem from heterogeneous risk premia.

Firm-level MPK and risk. Expression (7) reveals a tight link between expected MPK and expected stock returns.
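As a purely illustrative sketch of how expression (7) maps dispersion in expected returns into dispersion in expected MPK, the snippet below takes a vector of expected excess returns and an elasticity $1/\psi$ as given. Both inputs are hypothetical placeholders rather than estimates from the paper, and the function name is introduced here only for the example.

```python
import numpy as np

def implied_mpk_dispersion(expected_returns, elasticity):
    """Cross-sectional std. dev. of log expected MPK implied by expression (7).

    `elasticity` plays the role of 1/psi, i.e., the slope of expected log MPK
    on expected excess returns.
    """
    return abs(elasticity) * np.std(expected_returns)

# Hypothetical inputs, for illustration only.
exp_ret = np.array([0.04, 0.06, 0.08, 0.10, 0.12])   # firm-level expected excess returns
elasticity = 4.0                                      # assumed value of 1/psi

sd_mpk = implied_mpk_dispersion(exp_ret, elasticity)
print(f"implied std. dev. of expected log MPK: {sd_mpk:.4f}")
```

The calculation presumes that expected returns are already in hand, which is the practical difficulty the empirical strategy has to address.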
However, empirically implementing this equation is challenging since neither expected MPK nor expected returns are directly observable. One approach is to proxy these variables with their realized values, but this may be problematic: since both variables respond to the same realizations of shocks, a positive relationship may be mechanical and not indicative of the relationship between expected values. To overcome these hurdles, we follow a two-stage instrumental variables approach in which we instrument for returns using common measures of risk exposure. First, in a preliminary step, we estimate time-varying risk exposures from backwards-looking rolling window regressions of individual firm stock returns on aggregate risk factors. Then, in the first stage, we estimate period-by-period cross-sectional regressions of realized stock market returns on the estimated exposures. The predicted values from these regressions yield measures of expected excess returns, i.e., risk premia, that are driven only by exposure to the aggregate risk factors considered.11 Next, we estimate the second stage regression implied by equation (7) using a panel specification with these predicted values as instruments. Because the focus in the misallocation literature is generally on within-industry dispersion in MPK (MPK may vary across industries due to heterogeneity on a number of additional dimensions), we include a full set of industry-by-year fixed effects.

10 The proportionality is exact under a single source of aggregate risk. A slightly modified expression holds with multiple sources.

We perform this procedure using three different sets of aggregate risk factors, taken from three common asset pricing models: the CAPM, the Fama-French 3 Factor model, and the Hou et al. (2015) $q^5$ 5 factor model. In the CAPM, the aggregate market return is the single source of aggregate risk.
The latter two models add returns on additional portfolios of firms sorted by a number of different characteristics. We provide further details in Appendix B.1.

11 Because the estimates of risk exposures use data only through time $t$, they forecast only the expected component of realized returns at time $t+1$.

In principle, once we have a measure of expected returns, proxying expected MPK with realized MPK should be innocuous, since the forecast errors in future realizations of MPK – the difference between the two measures – should be uncorrelated with backwards-looking data on stock returns. However, for completeness, we also construct a measure of expected MPK as follows: under our assumptions, realized (log) MPK is given by $mpk_{it} = y_{it} - k_{it}$ (we suppress constants that play no role), where $y_{it}$ denotes (log) revenue. We assume that firm-level productivity, equal to $a_{it} = y_{it} - \theta k_{it}$, follows an AR(1) process with persistence $\rho_a$. Then, expected MPK is given by $E_t[mpk_{it+1}] = E_t[y_{it+1}] - k_{it+1} = \rho_a a_{it} - (1-\theta)\,k_{it+1}$. Consistent with our estimates in Section 4 below, we use $\rho_a = 0.94$ and $\theta = 0.62$.12

We implement this approach using firm-level data from the Center for Research in Security Prices (CRSP) and Compustat.13 We measure firm capital stock, $K_{it}$, as the (net of depreciation) value of property, plant and equipment and firm revenue, $Y_{it}$, as reported sales.14 Full details of the data sample are in Appendix B.1.

Table 1 reports results from regressions of MPK and expected MPK on lagged expected stock market returns. The left-hand panel shows that each of the asset pricing models we consider yields a statistically significant relationship between expected returns and future realized MPK.15 The economic magnitudes are also significant: the estimated coefficients imply that a 100 basis point increase in the expected return is associated with a 3-5% increase in
MPK. The right-hand panel reports analogous results using measures of expected, rather than realized, MPK. The estimated coefficients are similar, and if anything, slightly larger, though the differences are not statistically significant.

12 A simplifying assumption here is that the aggregate and idiosyncratic components of $a_{it}$ have the same persistence. Our quantitative work in Section 4 suggests this is a reasonable approximation, where we find the persistences of the two components to be quite close.
13 By using Compustat data, our analysis focuses on large, publicly traded firms for which financial market data are available. Extending the analysis to private firms would be a valuable exercise, but faces a significant challenge in deriving accurate measures of risk premia (our alternative approach in Section 4.4 may be one way).
14 Using book assets, a broader notion of firm capital, yields similar results.
15 Standard errors are two-way clustered by firm and year, but do not account for estimation error in measured risk exposures. In Appendix C, we perform a bootstrapping procedure designed to address this issue.

Table 1: MPK and Risk Premia

                          mpk                                 E[mpk]
               (1)         (2)         (3)         (4)         (5)         (6)
E_capm[ret]    4.798***                            4.159***
               (4.08)                              (8.70)
E_ff3[ret]                 3.284***                            5.016***
                           (6.80)                              (8.22)
E_q5[ret]                              5.388***                            6.299***
                                       (4.62)                              (10.71)

Notes: This table reports results from a two-step procedure in which we estimate the elasticity of firm-level MPK to expected excess stock returns. In the first stage, we instrument for expected returns using measures of firm risk exposures from stock market data. In the second stage, we run regressions of the form (7) with industry-by-year fixed effects. Standard errors are two-way clustered by firm and year. t-statistics in parentheses. Significance levels are denoted by: * p < 0.10, ** p < 0.05, *** p < 0.01.
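The two-stage procedure described in this section can be sketched on simulated data. The following is a minimal illustration, assuming a single aggregate factor as in the CAPM; the function names (`ols`, `rolling_beta`, `first_stage`, `second_stage`) and every simulated number are hypothetical and are not the authors' code. Industry-by-year fixed effects, the panel dimension and clustered standard errors are all omitted.

```python
import random
from statistics import mean


def ols(y, x):
    """Univariate OLS of y on x; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def rolling_beta(firm_ret, factor, t, window):
    """Preliminary step: exposure from a backward-looking window ending at t."""
    lo = t - window + 1
    return ols(firm_ret[lo:t + 1], factor[lo:t + 1])[1]


def first_stage(realized_next, betas):
    """Cross-sectional regression of t+1 returns on time-t betas.

    The fitted values serve as the proxy for expected excess returns."""
    a, lam = ols(realized_next, betas)
    return [a + lam * b for b in betas]


def second_stage(mpk_next, expected_ret):
    """Slope of t+1 mpk on first-stage fitted returns (no fixed effects)."""
    return ols(mpk_next, expected_ret)[1]


# Simulated illustration; all parameter values below are hypothetical.
random.seed(0)
T, N, window = 60, 200, 40
factor = [random.gauss(0.06, 0.15) for _ in range(T + 1)]
true_beta = [random.gauss(1.0, 0.4) for _ in range(N)]
returns = [[b * f + random.gauss(0, 0.10) for f in factor] for b in true_beta]

betas_t = [rolling_beta(r, factor, T - 1, window) for r in returns]
exp_ret = first_stage([r[T] for r in returns], betas_t)
mpk_next = [3.0 * 0.06 * b + random.gauss(0, 0.05) for b in true_beta]
elasticity = second_stage(mpk_next, exp_ret)
```

With a single cross-section the fitted values scale with that period's realized factor return, so the resulting slope is noisy; pooling many periods, as the panel specification in the text does, is what makes the estimate informative.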
Thus, our two-stage approach reveals a significant relationship between firm-level risk premia, instrumented using common measures of risk exposure from stock market data, and MPK. The results in Table 1 suggest a simple back-of-the-envelope calculation of the role of risk premia in driving MPK dispersion. As an example, consider the Fama-French 3 factor model, perhaps the most widely used framework to study the cross-section of stock returns. Using this model to estimate expected returns yields a within-industry standard deviation of about 0.127 (details in Appendix B.2). Multiplying this value by the estimated elasticity of MPK to expected returns in Table 1 (3.3) and squaring yields a predicted cross-sectional variance of mpk of about 0.17. The total within-industry variance of mpk in the Compustat sample is about 0.45, implying that variation in risk premia can account for almost 40% of the total. A similar calculation using expected MPK gives a slightly larger share. Although suggestive, this calculation ignores a number of factors that may affect the relationship between MPK and returns, e.g., possible nonlinearities, adjustment costs and/or other investment distortions of the type that have been emphasized in the misallocation literature. In Sections 3 and 4, we develop a quantitative model and empirical strategy that incorporates these additional considerations and allows for counterfactual decompositions of the role of risk effects in driving observed MPK dispersion. The results from the more general model turn out to be broadly in line with (though slightly smaller than) the simple calculation performed here. MPK dispersion and the price of risk. Expression (4) illustrates a second key implication of the simple framework: MPK dispersion is positively related to the price of risk. 
To explore 13 Table 2: MPK Dispersion and the Price of Risk σ (mpkit+1 ) Excess Bond Premia (1) 0.357∗∗∗ (3.08) σ (Et [mpkit+1 ])) (2) (3) 0.191∗∗∗ (2.75) GZ Spread 174 0.0644 (5) (6) 0.338∗∗∗ (5.68) -0.176∗∗∗ (-3.51) 174 0.0660 PD Ratio Observations R-squared (4) 0.633∗∗∗ (6.89) 174 0.0653 174 0.218 174 0.221 -0.228∗∗∗ (-4.09) 174 0.117 Notes: This table reports time-series regressions of four-quarter ahead mpk dispersion on measures of the price of risk. t-statistics are in parentheses, computed using Newey-West standard errors. Significance levels are denoted by: * p < 0.10, ** p < 0.05, *** p < 0.01. this prediction, we estimate time-series regressions of the form σ (mpkt+1 ) = ψ0 + ψ1 λt + ζt+1 , where λt denotes three different proxies for the price of risk: the price/dividend (PD) ratio on the aggregate stock market and two measures of credit spreads – the Gilchrist and Zakrajsek (2012) (GZ) spread, a high-information and duration-adjusted measure of the mean credit spread and the excess bond premium, which measures the portion of the GZ spread not attributable to default risk.16 Table 2 reports regressions of the within-industry standard deviations of MPK (left-hand panel) and expected MPK (right-hand panel) on the lagged values of these measures.17 Each predicts MPK dispersion, and in the direction the theory suggests: the GZ Spread and excess bond premium predict greater MPK dispersion, while a higher PD ratio predicts lower dispersion. Because the measures of the price of risk are countercyclical, the results imply that time-series variation in risk premia induce a countercyclical component in MPK dispersion, in line with (and potentially in part accounting for) the well known evidence of countercyclicality documented in Eisfeldt and Rampini (2006) and Kehrig (2015).18 16 We extract the cyclical component of the PD ratio and MPK dispersion using a one-sided Hodrick-Prescott filter. The credit spread measures do not exhibit significant longer-term trends. 
17 Within-industry standard deviations are calculated by first de-meaning mpk (expected mpk) by industry-year. To control for the changing composition of firms, for each quarter, we include only firms that were present in the previous quarter and calculate changes in the standard deviation for these firms. We then use those changes to construct a composition-adjusted series that is unaffected by new additions or deletions from the dataset. Details of this procedure are in Appendix B.1.
18 We report the correlations of these measures with de-trended GDP and TFP in Table 6 in Appendix B.3.

Additional exercises. In Appendix C, we explore a number of additional implications of our framework. We use MPK-sorted portfolios of firms to show that high MPK firms tend to offer higher expected stock market returns. We verify that firm-level MPK is related to direct measures of the sensitivity of MPK (rather than stock returns) to aggregate risk, as suggested by (3). We show that industries with greater MPK dispersion also tend to be those with more dispersion in expected returns and heterogeneity in firm-level risk exposures.

3 Quantitative Model

In this section, we use a more detailed version of the investment model laid out above to quantitatively investigate the contribution of heterogeneous risk premia to observed MPK dispersion. The model is kept deliberately simple in order to isolate the role of our basic mechanism, namely dispersion in exposure to macroeconomic risk. The theory consists of two main building blocks: (i) a stochastic discount factor, which we directly parameterize to be consistent with salient patterns in financial markets, i.e., high and countercyclical prices of risk and (ii) a cross-section of heterogeneous firms, which make optimal investment decisions in the presence of firm-level and aggregate risk, given the stochastic discount factor.
Specifying the stochastic discount factor exogenously allows us to sidestep challenges with generating empirically relevant risk prices in general equilibrium, and focus on gauging the quantitative strength of our mechanism. To home in on the effects of risk premia, we begin with a simplified version in which we abstract from additional adjustment frictions. In this case, our framework yields exact closed form solutions for firm investment decisions. In Section 3.2, we extend the model to include capital adjustment costs. Our theoretical results there reveal an important amplification effect of these costs on the impact of risk premia.19

19 We also consider the effects of other investment frictions, e.g., “wedges,” or distortions, in Section 4.3.

Heterogeneity in risk exposures. The setup is a fleshed-out version of that in Section 2. We consider a discrete time, infinite-horizon economy. A continuum of firms of fixed measure one, indexed by $i$, produce a homogeneous good using capital and labor according to:

$$Y_{it} = X_t^{\hat\beta_i}\,\hat Z_{it}\,K_{it}^{\theta_1} N_{it}^{\theta_2}, \qquad \theta_1 + \theta_2 < 1. \tag{8}$$

Firm productivity (in logs) is equal to $\hat\beta_i x_t + \hat z_{it}$, where $x_t$ denotes an aggregate component that is common across firms and $\hat\beta_i$ captures the exposure of the productivity of firm $i$ to aggregate conditions. We assume that $\hat\beta_i$ is distributed as $\hat\beta_i \sim \mathcal{N}\left(\bar{\hat\beta}, \sigma_{\hat\beta}^2\right)$ across firms. Heterogeneity in this exposure is a key ingredient of our framework – cross-sectional variation in $\hat\beta_i$ will lead directly to dispersion in expected MPK. The term $\hat z_{it}$ denotes a firm-specific, idiosyncratic component of productivity.20 The two productivity components follow AR(1) processes (in logs):

$$x_{t+1} = \rho_x x_t + \varepsilon_{t+1}, \qquad \varepsilon_{t+1} \sim \mathcal{N}\left(0, \sigma_\varepsilon^2\right)$$
$$\hat z_{it+1} = \rho_z \hat z_{it} + \hat\varepsilon_{it+1}, \qquad \hat\varepsilon_{it+1} \sim \mathcal{N}\left(0, \sigma_{\hat\varepsilon}^2\right). \tag{9}$$

Thus, there are two sources of uncertainty at the firm level – aggregate uncertainty, with conditional variance $\sigma_\varepsilon^2$, and idiosyncratic uncertainty, with variance $\sigma_{\hat\varepsilon}^2$.

Stochastic discount factor.
In line with the large literature on cross-sectional asset pricing in production economies, we parameterize directly the pricing kernel without explicitly modeling the consumer’s problem. In particular, we specify the SDF as

$$\log M_{t+1} \equiv m_{t+1} = \log\rho - \gamma_t \varepsilon_{t+1} - \frac{1}{2}\gamma_t^2\sigma_\varepsilon^2, \qquad \gamma_t = \gamma_0 + \gamma_1 x_t, \tag{10}$$

where $\rho$, $\gamma_0 > 0$ and $\gamma_1 \le 0$ are constant parameters.21 The SDF is determined by shocks to aggregate productivity. The conditional volatility of the SDF, $\sigma_m = \gamma_t\sigma_\varepsilon$, varies through time as determined by $\gamma_t$. This formulation allows us to capture in a simple manner a high, time-varying and countercyclical price of risk as observed in the data (since $\gamma_1 < 0$, $\gamma_t$ is higher following economic contractions, i.e., when $x_t$ is negative). Additionally, directly parameterizing $\gamma_0$ and $\gamma_1$ enables the model to be quantitatively consistent with key moments of asset returns, which are important for our analysis. The risk free rate is constant and equal to $-\log\rho$. Thus, $\gamma_0$ and $\gamma_1$ only affect the properties of equity returns, easing the interpretation of these parameters. The maximum attainable Sharpe ratio is equal to the conditional standard deviation of the SDF, i.e., $SR_t = \gamma_t\sigma_\varepsilon$, and the price of risk is equal to the square of the Sharpe ratio, $\gamma_t^2\sigma_\varepsilon^2$.

For simplicity, the setup thus far features (i) a single source of aggregate risk and (ii) a tight link between financial market conditions (i.e., $\gamma_t$) and macroeconomic conditions (i.e., $x_t$). In Appendix F we extend this framework to (i) include multiple risk factors and (ii) decouple movements in financial and macroeconomic conditions by including pure financial shocks that affect the price of risk but otherwise do not impact firm profits/productivity. Similar insights from the simpler model go through under those extensions.

20 More broadly, expression (8) should be thought of as a revenue-generating function and the “productivity” components as also capturing demand factors, see, e.g., Section 5.
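The properties of the SDF in (10) can be checked numerically. The sketch below is only an illustration: the function names and the parameter values (`RHO`, `GAMMA0`, `GAMMA1`, `SIGMA_EPS`) are hypothetical and are not the paper's estimates. It uses the lognormal mean formula to confirm that the conditional mean of the SDF equals ρ in every state, so that the risk-free rate is constant, and it shows that the maximum Sharpe ratio rises when $x_t$ is low.

```python
import math


def gamma_t(x_t, gamma0, gamma1):
    """State-dependent loading gamma_t = gamma0 + gamma1 * x_t."""
    return gamma0 + gamma1 * x_t


def log_sdf(eps_next, x_t, rho, gamma0, gamma1, sigma_eps):
    """Log SDF m_{t+1} from (10) for a given aggregate shock eps_next."""
    g = gamma_t(x_t, gamma0, gamma1)
    return math.log(rho) - g * eps_next - 0.5 * g ** 2 * sigma_eps ** 2


def expected_sdf(x_t, rho, gamma0, gamma1, sigma_eps):
    """E_t[M_{t+1}] for a lognormal M: exp(mean + variance / 2)."""
    g = gamma_t(x_t, gamma0, gamma1)
    mean_m = math.log(rho) - 0.5 * g ** 2 * sigma_eps ** 2
    var_m = (g * sigma_eps) ** 2
    return math.exp(mean_m + 0.5 * var_m)


def max_sharpe_ratio(x_t, gamma0, gamma1, sigma_eps):
    """Maximum Sharpe ratio gamma_t * sigma_eps, as stated in the text."""
    return gamma_t(x_t, gamma0, gamma1) * sigma_eps


# Hypothetical parameter values, chosen only for illustration.
RHO, GAMMA0, GAMMA1, SIGMA_EPS = 0.97, 20.0, -100.0, 0.02

risk_free = {x: -math.log(expected_sdf(x, RHO, GAMMA0, GAMMA1, SIGMA_EPS))
             for x in (-0.05, 0.0, 0.05)}
sharpe = {x: max_sharpe_ratio(x, GAMMA0, GAMMA1, SIGMA_EPS)
          for x in (-0.05, 0.0, 0.05)}
```

The Jensen term $-\frac{1}{2}\gamma_t^2\sigma_\varepsilon^2$ in (10) is what makes the risk-free rate independent of $\gamma_t$.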
Input choices. Firms hire labor period-by-period at a competitive wage, $W_t$. To keep the labor market simple, we assume that the equilibrium wage is given by $W_t = X_t^\omega$, i.e., the wage is a constant elasticity and increasing function of aggregate productivity, where $\omega \in [0, 1]$ determines the sensitivity of wages to aggregate conditions.22 Maximizing over the static labor decision gives operating profits – revenues less labor costs – as

$$\Pi_{it} = G X_t^{\beta_i} Z_{it} K_{it}^{\theta}, \tag{11}$$

where $G \equiv (1-\theta_2)\,\theta_2^{\frac{\theta_2}{1-\theta_2}}$, $\beta_i \equiv \frac{1}{1-\theta_2}\left(\hat\beta_i - \omega\theta_2\right)$, $Z_{it} \equiv \hat Z_{it}^{\frac{1}{1-\theta_2}}$ and $\theta \equiv \frac{\theta_1}{1-\theta_2}$. The exposure of firm profits to aggregate conditions is captured by $\beta_i$, which is a simple transformation of the underlying exposure of firm productivity to the aggregate component, $\hat\beta_i$, and the sensitivity of wages, $\omega$.23 The idiosyncratic component of productivity is similarly scaled, by $\frac{1}{1-\theta_2}$. The curvature of the profit function is equal to $\theta$, which depends on the relative elasticities of capital and labor in production. These scalings reflect the leverage effects of labor liabilities on profits.

From here on, we will primarily work with $z_{it}$, which has the same persistence as $\hat z_{it}$, i.e., $\rho_z$, and innovations $\varepsilon_{it+1} = \frac{1}{1-\theta_2}\hat\varepsilon_{it+1}$ with variance $\sigma_{\tilde\varepsilon}^2 = \left(\frac{1}{1-\theta_2}\right)^2\sigma_{\hat\varepsilon}^2$. We will also use the fact that $\sigma_\beta^2 = \left(\frac{1}{1-\theta_2}\right)^2\sigma_{\hat\beta}^2$. Notice that the profit function takes precisely the form assumed in Section 2. Thus, the firm’s dynamic investment problem takes the form in expression (1).

21 This specification builds closely on those in, for example, Zhang (2005), Gomes and Schmid (2010) and Jones and Tuzel (2013).

Optimal investment. The simplicity of this setting leads to exact analytical expressions for the firm’s investment decision.
Specifically, we show in Appendix D.1 that the firm’s optimal investment policy is given by:

$$k_{it+1} = \frac{1}{1-\theta}\left(\tilde\alpha + \beta_i\rho_x x_t + \rho_z z_{it} - \beta_i\gamma_t\sigma_\varepsilon^2\right), \tag{12}$$

where $\tilde\alpha \equiv \log\theta + \log G - \alpha$, $\alpha \equiv \log\left(r_f + \delta\right)$.24 The firm’s choice of capital is increasing in $x_t$ and $z_{it}$ due to their direct effect on expected future productivity (i.e., $\beta_i\rho_x x_t + \rho_z z_{it} = E_t\left[\beta_i x_{t+1} + z_{it+1}\right]$), but, ceteris paribus, firms with higher betas choose a lower level of capital. The magnitude of this effect is larger when $\gamma_t$ is large, i.e., in economic downturns. Clearly, with risk neutrality, i.e., $\gamma_0 = \gamma_1 = 0$, the last term is zero and investment is purely determined by expected productivity.

22 This setup follows, for example, Belo et al. (2014) and İmrohoroğlu and Tüzel (2014).
23 The adjustment term for labor supply, $\omega\theta_2$, has a small effect on the mean of the $\beta$ distribution, but otherwise does not affect our analysis.
24 More precisely, there are also terms that reflect the variance of shocks. Because these terms are negligible and play no role in our analysis (they are independent of the risk premium effects we measure), we suppress them here. The full expressions are given in Appendix D.1.

For a slightly different intuition, we substitute for $\gamma_t$ and write the expression as

$$k_{it+1} = \frac{1}{1-\theta}\left(\tilde\alpha + \beta_i\left(\rho_x - \gamma_1\sigma_\varepsilon^2\right)x_t + \rho_z z_{it} - \beta_i\gamma_0\sigma_\varepsilon^2\right). \tag{13}$$

The risk premium affects the capital choice through both the time-varying and constant components of the price of risk: first, a more negative $\gamma_1$ increases the responsiveness of firms to aggregate conditions. Intuitively, a high (low) realization of $x_t$ has two effects – first, since $x_t$ is persistent, it signals that productivity is likely to be high (low) in the future, increasing (decreasing) investment (this force is captured by the $\rho_x$ term). Moreover, a high (low) realization of $x_t$ implies a low (high) price of risk, which further increases (decreases) investment.
Second, the constant component of the risk premium, $\gamma_0$, adds a firm-specific constant – i.e., a fixed effect – which leads to permanent dispersion in firm-level capital.

MPK dispersion. By definition, the realized mpk is given by $mpk_{it+1} = \log\theta + \pi_{it+1} - k_{it+1}$. Substituting for $k_{it+1}$,

$$mpk_{it+1} = \alpha + \varepsilon_{it+1} + \beta_i\varepsilon_{t+1} + \beta_i\gamma_t\sigma_\varepsilon^2, \tag{14}$$

and taking conditional expectations,

$$Empk_{it+1} \equiv E_t\left[mpk_{it+1}\right] = \alpha + \beta_i\gamma_t\sigma_\varepsilon^2, \tag{15}$$

where $\alpha$ is as defined in equation (12) and reflects the risk-free user cost of capital.

Expression (14) shows that dispersion in the realized mpk can stem from uncertainty over the realization of shocks, as well as the risk premium term, which is persistent at the firm level and depends on (i) the firm’s exposure to the aggregate shock, $\beta_i$ (and is increasing in $\beta_i$), and (ii) the time $t$ price of risk, which is reflected in the term $\gamma_t\sigma_\varepsilon^2$. Intuitively, firm-level mpk deviations are composed of both a transitory component due to uncertainty and a persistent component due to the risk premium. The transitory components are i.i.d. over time and lead to purely temporary deviations in mpk (even though the underlying productivity processes are autocorrelated); the risk premium, on the other hand, leads to persistent deviations – firms that are more exposed to aggregate shocks, and so are riskier, will have persistently high mpk.

Expression (15) homes in on this second force and shows the persistent effects of risk premia on the conditional expectation of time $t+1$ mpk, denoted Empk. Indeed, in this simple case, the ranking of firms’ mpk will be constant in expectation as determined by the risk premium – high beta firms will have permanently high Empk and low beta firms the opposite. Importantly, the value of Empk will fluctuate with $\gamma_t$, but the ordering across firms will be preserved. This is the sense in which we call this component persistent/permanent.
Expression (14) shows that this ordering will not be preserved in realized mpk – due to the realization of shocks, the ranking of firms’ mpk will fluctuate, but the firm-specific risk premium adds a persistent component.25

Because the uncertainty portion of the realized mpk is always additively separable and is independent of our mechanism, from here on we primarily work with Empk. Expression (16) presents the cross-sectional variance of Empk:

$$\sigma^2_{Empk_t} \equiv \sigma^2_{E_t[mpk_{it+1}]} = \sigma_\beta^2\left(\gamma_t\sigma_\varepsilon^2\right)^2. \tag{16}$$

Cross-sectional variation in Empk depends on the dispersion in beta and the price of risk. Dispersion will be greater when risk prices, reflected by $\gamma_t\sigma_\varepsilon^2$, are high and so will be countercyclical. The average long-run level of Empk dispersion is given by

$$E\sigma^2_{Empk} \equiv E\left[\sigma^2_{Empk_t}\right] = \sigma_\beta^2\left(\gamma_0^2 + \gamma_1^2\sigma_x^2\right)\left(\sigma_\varepsilon^2\right)^2, \qquad \text{where } \sigma_x^2 = \frac{\sigma_\varepsilon^2}{1-\rho_x^2}. \tag{17}$$

Aggregate outcomes. Appendix D.3 shows that aggregate output can be expressed as

$$\log Y_{t+1} \equiv y_{t+1} = a_{t+1} + \theta_1 k_{t+1} + \theta_2 n_{t+1},$$

where $k_{t+1}$ denotes the aggregate capital stock, $n_{t+1}$ aggregate labor and $a_{t+1}$ the level of aggregate TFP, given by

$$a_{t+1} = a^*_{t+1} - \frac{1}{2}\,\frac{\theta_1\left(1-\theta_2\right)}{1-\theta_1-\theta_2}\,\sigma^2_{mpk,t+1}, \tag{18}$$

where $\sigma^2_{mpk,t+1}$ is realized mpk dispersion in period $t+1$. The term $a^*_{t+1}$ is the first-best level of TFP in the absence of any frictions (i.e., where marginal products are equalized). Thus, aggregate TFP monotonically decreases in the extent of capital “misallocation,” captured by $\sigma^2_{mpk}$. The effect of misallocation on aggregate TFP depends on the overall curvature in the production function, $\theta_1 + \theta_2$, and the relative shares of capital and labor. The higher is $\theta_1 + \theta_2$, that is, the closer to constant returns to scale, the more severe the losses from misallocated resources. Similarly, fixing the degree of overall returns to scale, for a larger capital share, $\theta_1$, a given degree of misallocation has larger effects on aggregate outcomes.
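The step from (16) to (17) relies on the unconditional second moment of $\gamma_t$. It can be spelled out as a short derivation, under the assumption (not stated explicitly in the text) that $x_t$ is drawn from the stationary distribution of the AR(1) process in (9), so that $E[x_t] = 0$:

```latex
\begin{align*}
\sigma_x^2 &= \rho_x^2\,\sigma_x^2 + \sigma_\varepsilon^2
  \;\Rightarrow\; \sigma_x^2 = \frac{\sigma_\varepsilon^2}{1-\rho_x^2}, \\
E\left[\gamma_t^2\right] &= E\left[(\gamma_0 + \gamma_1 x_t)^2\right]
  = \gamma_0^2 + 2\gamma_0\gamma_1 E[x_t] + \gamma_1^2 E\left[x_t^2\right]
  = \gamma_0^2 + \gamma_1^2\sigma_x^2, \\
E\left[\sigma^2_{Empk_t}\right] &= \sigma_\beta^2\left(\sigma_\varepsilon^2\right)^2 E\left[\gamma_t^2\right]
  = \sigma_\beta^2\left(\gamma_0^2 + \gamma_1^2\sigma_x^2\right)\left(\sigma_\varepsilon^2\right)^2 .
\end{align*}
```

The cross term drops out because $x_t$ has mean zero, which is why the constant and time-varying components of the price of risk enter (17) additively.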
Using equation (16), the conditional expectation of one-period ahead TFP is given by

$$E_t\left[a_{t+1}\right] = E_t\left[a^*_{t+1}\right] - \frac{1}{2}\,\frac{\theta_1\left(1-\theta_2\right)}{1-\theta_1-\theta_2}\,\sigma_\beta^2\left(\gamma_t\sigma_\varepsilon^2\right)^2. \tag{19}$$

25 With additional adjustment frictions, there will be other factors confounding the relationship between beta and the realized and expected mpk.

The expression shows that risk premium effects unambiguously reduce aggregate TFP and disproportionately more so in business cycle downturns, since $\gamma_t$ is countercyclical. Taking unconditional expectations gives the effects on the average long-run level of TFP in the economy:

$$\bar a \equiv E\left[E_t\left[a_{t+1}\right]\right] = a^* - \frac{1}{2}\,\frac{\theta_1\left(1-\theta_2\right)}{1-\theta_1-\theta_2}\,\sigma_\beta^2\left(\gamma_0^2 + \gamma_1^2\sigma_x^2\right)\left(\sigma_\varepsilon^2\right)^2. \tag{20}$$

The expression directly links the extent of cross-sectional dispersion in required rates of return (which are in turn determined by the prices of risk and volatility of aggregate shocks) to the long-run level of aggregate productivity and gives a natural way to quantify the implications of these effects. Further, and perhaps more importantly, it uncovers a new connection between aggregate volatility and long-run economic outcomes, i.e., a “productivity cost” of business cycles – ceteris paribus, the higher is aggregate volatility ($\sigma_\varepsilon^2$ and $\sigma_x^2$ in the expression), the more depressed will be the average long-run level of TFP (relative to an environment with no aggregate shocks and/or risk premia).

In Appendix F, we show that our model can be extended to include multiple sources of aggregate risk and to allow $\gamma_t$ to depend on additional factors beyond the state of technology, so that expressions (19) and (20) provide a more general connection between financial conditions (that may be less than perfectly correlated with the real economy), the cross-sectional allocation of resources and aggregate TFP.26 Thus, more broadly, these expressions provide one way to link the rich findings of the literature on cross-sectional asset pricing to real allocations and macroeconomic outcomes.
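The mapping from risk-premium dispersion to aggregate TFP in (16) through (20) is simple enough to evaluate directly. The sketch below is an illustration only: the function names are hypothetical, $\theta_1 = 0.28$, $\theta_2 = 0.57$ and $\rho_x = 0.94$ are the values the paper reports in Section 4.1, and the remaining inputs (`GAMMA0`, `GAMMA1`, `SIGMA_BETA2`, `SIGMA_EPS2`) are placeholders rather than the paper's estimates.

```python
def empk_variance(gamma_t, sigma_beta2, sigma_eps2):
    """Cross-sectional variance of expected mpk at a point in time, as in (16)."""
    return sigma_beta2 * (gamma_t * sigma_eps2) ** 2


def long_run_empk_variance(gamma0, gamma1, rho_x, sigma_beta2, sigma_eps2):
    """Average variance of expected mpk, as in (17)."""
    sigma_x2 = sigma_eps2 / (1 - rho_x ** 2)
    return sigma_beta2 * (gamma0 ** 2 + gamma1 ** 2 * sigma_x2) * sigma_eps2 ** 2


def tfp_loss(mpk_variance, theta1, theta2):
    """Log TFP loss implied by a given mpk variance, as in (18)."""
    scale = 0.5 * theta1 * (1 - theta2) / (1 - theta1 - theta2)
    return scale * mpk_variance


THETA1, THETA2, RHO_X = 0.28, 0.57, 0.94       # reported in Section 4.1
GAMMA0, GAMMA1 = 30.0, -200.0                   # hypothetical
SIGMA_BETA2, SIGMA_EPS2 = 10.0, 0.0006          # hypothetical

variance = long_run_empk_variance(GAMMA0, GAMMA1, RHO_X, SIGMA_BETA2, SIGMA_EPS2)
long_run_loss = tfp_loss(variance, THETA1, THETA2)

# Shutting down time variation in the price of risk (gamma1 = 0) isolates
# the part of the loss due to the constant component gamma0.
constant_part = tfp_loss(
    long_run_empk_variance(GAMMA0, 0.0, RHO_X, SIGMA_BETA2, SIGMA_EPS2),
    THETA1, THETA2)
```

Because the loss is linear in the variance, any percentage change in long-run Empk dispersion translates one-for-one into the same percentage change in the TFP loss.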
3.1 The Cross-Section of Expected Stock Returns and MPK

In this section, we derive a sharp link between a firm’s beta – and so expected mpk – and its expected stock market return, along the lines developed in Section 2.2. This connection suggests an empirical strategy to measure the dispersion in beta and so quantify the mpk dispersion that arises from risk considerations using stock market data. Our key finding is that, to a first-order approximation, the firm’s expected stock return is a linear (and increasing) function of its beta.27 The implication is that, in the simple model outlined thus far, expected mpk is proportional to expected stock returns, and thus, the dispersion in expected stock returns puts tight empirical discipline on the dispersion in betas and so expected mpk arising from risk channels. We use this connection to provide transparent intuition for our numerical approach in Section 4.

26 Further, we can verify that the additional extensions discussed in Section 2.1, i.e., firm-specific “alphas” and heterogeneous exposures to investment goods prices have similar implications.
27 It is well known that a first-order approximation may not be sufficient to capture risk premia. In our quantitative work in Section 4, we work with numerical higher order approximations.

We obtain an analytic approximation for expected stock market returns by log-linearizing around the non-stochastic steady state where $X_t = Z_t = 1$. To a first-order, the (log of the) expected excess stock return is equal to (derivations in Appendix D.4)

$$Er^e_{it+1} \equiv \log E_t\left[R^e_{it+1}\right] = \psi\beta_i\gamma_t\sigma_\varepsilon^2, \tag{21}$$

where

$$\psi = \frac{\frac{1}{\rho} + \delta - 1}{\frac{1}{\rho} + \delta\left(1-\theta\right) - 1}\cdot\frac{1-\rho}{1 - \rho\rho_x + \rho\gamma_1\sigma_\varepsilon^2}.$$

The expected excess return depends on the firm’s beta (indeed, is linear and increasing in beta) and is increasing in the price of risk.
Because the price of risk is countercyclical, risk premia increase during downturns for all firms and fall during expansions.28 The time $t$ cross-sectional dispersion in expected excess returns is given by

$$\sigma^2_{Er^e_t} \equiv \sigma^2_{\log E_t\left[R^e_{it+1}\right]} = \psi^2\sigma_\beta^2\left(\gamma_t\sigma_\varepsilon^2\right)^2. \tag{22}$$

Similar to our findings for expected mpk, the expression reveals a tight link between beta dispersion and expected stock return dispersion. Indeed, if firms had identical betas, dispersion in expected returns would be zero. Moreover, as with expected mpk dispersion, expected stock return dispersion is increasing in the price of risk and so is countercyclical.

Comparing equations (15) and (21) shows that expected excess returns are proportional to expected mpk and equations (16) and (22) show that $\sigma^2_{Er^e_t}$ is proportional to $\sigma^2_{Empk_t}$. Thus, the expressions reveal a tight connection between cross-sectional dispersion in expected stock returns and expected mpk – both are dependent on the variation in betas. Although the proportionality will not hold exactly in the full non-linear solution – or in the presence of other frictions/distortions to firm investment decisions – we will use this intuition to quantify the role of risk considerations in generating dispersion in expected mpk.

28 Strictly speaking, these results hold in the approximation so long as $1 - \rho\rho_x + \rho\gamma_1\sigma_\varepsilon^2 > 0$. This condition does not play a role in the numerical solution.

Specifically, these results suggest an empirical strategy to estimate the three key structural parameters – $\gamma_0$, $\gamma_1$ and $\sigma_\beta^2$ – using readily available stock market data. First, it is straightforward to verify that the market index – i.e., a perfectly diversified portfolio with no idiosyncratic risk – achieves the maximal Sharpe ratio:29

$$SR_{mt} = \gamma_t\sigma_\varepsilon, \qquad ESR_m \equiv E\left[SR_{mt}\right] = \gamma_0\sigma_\varepsilon. \tag{23}$$

The expression links the market Sharpe ratio to $\gamma_0$. Indeed, in this linearized environment, the mapping is one-to-one (given $\sigma_\varepsilon^2$).
Next, deriving equation (21) for the market index gives

$$Er_{mt+1} = \psi\bar\beta\gamma_t\sigma_\varepsilon^2, \qquad Er_m \equiv E\left[Er_{mt+1}\right] = \psi\bar\beta\gamma_0\sigma_\varepsilon^2. \tag{24}$$

For a given value of $\gamma_0$, the equity premium is increasing as $\gamma_1$ becomes more negative through its effects on $\psi$ ($\bar\beta$ denotes the mean beta across firms). Lastly, equation (22) connects dispersion in beta, $\sigma_\beta^2$, to dispersion in expected returns. Together, equations (22), (23) and (24) tightly link three observable moments of asset market data to the three parameters, $\gamma_0$, $\gamma_1$ and $\sigma_\beta^2$.

29 The Sharpe ratio for an individual firm is
$$SR_{it} = \frac{\beta_i\gamma_t\sigma_\varepsilon^2}{\sqrt{\left(\frac{1-\rho\rho_x+\rho\gamma_1\sigma_\varepsilon^2}{1-\rho\rho_z}\right)^2\sigma_{\tilde\varepsilon}^2 + \beta_i^2\sigma_\varepsilon^2}},$$
which shows that, due to the presence of idiosyncratic risk, individual firms do not attain the maximum Sharpe ratio. However, in this linear environment, the diversified index faces no risk from $\sigma_{\tilde\varepsilon}^2$, so that the expression collapses to (23). Although in the full numerical solution the market may not exactly attain this value due to the nonlinear effects of idiosyncratic shocks, the expression highlights that the market Sharpe ratio is informative about $\gamma_0$.

3.2 Adjustment Costs

In this section, we extend the framework to include capital adjustment costs. Although the main insights from the previous sections go through, we illustrate an important interaction between these costs and the effects of risk premia, namely, adjustment costs amplify the impact of these systematic risk exposures on mpk dispersion.

We assume that capital investment is subject to quadratic adjustment costs, given by

$$\Phi\left(I_{it}, K_{it}\right) = \frac{\xi}{2}\left(\frac{I_{it}}{K_{it}} - \delta\right)^2 K_{it}.$$

With these costs, the return on capital is no longer equal to the MPK plus the undepreciated capital stock, but depends on endogenous fluctuations in the value of installed capital, i.e., Tobin’s Q. Although exact analytic solutions are no longer available as in the simpler case, a first-order approximation yields the return on capital (the investment return) to be30

$$r^I_{it+1} = \left(1 - \rho\left(1-\delta\right)\right)mpk_{it+1} + \rho q_{it+1} - q_{it}. \tag{25}$$

The investment return depends on $mpk_{it+1}$, but additionally on $q_{it}$ and $q_{it+1}$, where $q_{it} \equiv \xi\left(k_{it+1} - k_{it}\right)$. As above, the risk premium on capital is equal to (the negative of) the covariance of the return with the SDF, i.e.,

$$\log E_t\left[r^I_{it+1}\right] = -\mathrm{cov}_t\left(\left(1 - \rho\left(1-\delta\right)\right)mpk_{it+1} + \rho q_{it+1},\; m_{t+1}\right), \tag{26}$$

which shows that adjustment costs add an additional element to the risk premium through endogenous changes in $q_{it}$ that are correlated with the SDF.

30 Throughout this section, we suppress constant terms that play no role.

Appendix D.2 derives the log-linearized version of the firm’s optimal investment policy:31

$$k_{it+1} = \phi_1\beta_i x_t + \phi_2 z_{it} + \phi_3 k_{it} - \phi_4\beta_i\gamma_0\sigma_\varepsilon^2, \tag{27}$$

where

$$0 = \left(\left(\theta - 1\right) - \hat\xi\left(1+\rho\right)\right)\phi_3 + \rho\hat\xi\phi_3^2 + \hat\xi$$

$$\phi_1 = \frac{\left(\rho_x - \gamma_1\sigma_\varepsilon^2\right)\phi_3}{\hat\xi\left(1 - \rho\phi_3\left(\rho_x - \gamma_1\sigma_\varepsilon^2\right)\right)}, \qquad \phi_2 = \frac{\rho_z\phi_3}{\hat\xi\left(1 - \rho\rho_z\phi_3\right)}$$

$$\phi_4 = \frac{\phi_3}{\hat\xi\left(1 - \rho\phi_3\right)}\cdot\frac{1}{1 - \rho\phi_3\left(\rho_x - \gamma_1\sigma_\varepsilon^2\right)},$$

and $\hat\xi \equiv \frac{\xi}{1-\rho\left(1-\delta\right)}$ is a composite parameter that captures the severity of adjustment costs. Now, the past level of capital affects the new chosen level. The coefficient $\phi_3$ captures the strength of this relationship. It lies between zero and one and is increasing in the adjustment cost, $\hat\xi$. It is independent of the risk premium. The other coefficients each have a counterpart in equation (13), but are modified to reflect the influence of adjustment costs. The coefficients $\phi_1$ and $\phi_2$ are both decreasing in these costs – intuitively, adjustment costs reduce the firm’s responsiveness to transitory shocks. Importantly, $\phi_4$ is increasing in these costs, showing that they increase the importance of the firm’s beta in determining its choice of capital.32 The expression for $\phi_4$ also reveals an interaction between adjustment costs and time-varying risk – the denominator contains the product of $\phi_3$ and $\gamma_1$, which implies that a more negative $\gamma_1$ leads to higher values of $\phi_4$ as long as adjustment costs are non-zero.
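The coefficients of the log-linear policy (27) are straightforward to compute once $\phi_3$ is obtained from its quadratic. The sketch below is an illustration under stated assumptions: the function name `policy_coefficients` and all parameter values are hypothetical, and $\phi_3$ is taken to be the smaller root of the quadratic, which is the one that lies between zero and one. As a consistency check, letting $\hat\xi$ approach zero recovers the frictionless coefficients of equation (13).

```python
import math


def policy_coefficients(xi_hat, theta, rho, rho_x, rho_z, gamma1, sigma_eps2):
    """Return (phi1, phi2, phi3, phi4) of the policy in (27), for xi_hat > 0."""
    # phi3 solves rho*xi_hat*phi^2 - b*phi + xi_hat = 0; the smaller root,
    # written in a numerically stable form, lies in (0, 1).
    b = (1 - theta) + xi_hat * (1 + rho)
    phi3 = 2 * xi_hat / (b + math.sqrt(b * b - 4 * rho * xi_hat ** 2))
    persist = rho_x - gamma1 * sigma_eps2
    amplification = 1 / (1 - rho * phi3 * persist)
    phi1 = persist * phi3 / xi_hat * amplification
    phi2 = rho_z * phi3 / (xi_hat * (1 - rho * rho_z * phi3))
    phi4 = phi3 / (xi_hat * (1 - rho * phi3)) * amplification
    return phi1, phi2, phi3, phi4


# Hypothetical parameter values, for illustration only.
THETA, RHO, RHO_X, RHO_Z = 0.65, 0.97, 0.94, 0.90
GAMMA1, SIGMA_EPS2 = -5.0, 0.0006

phi1, phi2, phi3, phi4 = policy_coefficients(
    0.5, THETA, RHO, RHO_X, RHO_Z, GAMMA1, SIGMA_EPS2)

# Factor multiplying the risk premium term in expected mpk.
amplification = 1 / (1 - RHO * phi3 * (RHO_X - GAMMA1 * SIGMA_EPS2))
```

The `amplification` term is the fraction that scales the risk premium in the expressions for expected mpk that follow; it exceeds one whenever $\phi_3 > 0$ and the condition in footnote 32 holds.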
By increasing the value of $\phi_4$, this interaction effect strengthens the impact of beta dispersion on Empk dispersion.

31 As above, we ignore terms reflecting variance adjustments that are close to zero.
32 Strictly speaking, this is true so long as $1 - \rho\phi_3\left(\rho_x - \gamma_1\sigma_\varepsilon^2\right) > 0$. This condition holds for any reasonable level of adjustment costs, for example, given our estimates of the other parameters, $\xi$ must be less than approximately 2180.

From here, we can derive the following expression for conditional expected mpk:

$$E_t\left[mpk_{it+1}\right] = \frac{1}{1 - \rho\phi_3\left(\rho_x - \gamma_1\sigma_\varepsilon^2\right)}\,\beta_i\gamma_t\sigma_\varepsilon^2 + \hat\xi\left(k_{it+1} - k_{it}\right) - \rho\hat\xi E_t\left[k_{it+2} - k_{it+1}\right]. \tag{28}$$

Expected mpk depends on both the risk premium and adjustment costs (realized mpk also depends on the realization of shocks, as above). The last two terms capture the effects of adjustment costs alone and, conditional on current and expected future capital stocks, are independent of aggregate risk. The first term captures the risk premium. Without adjustment costs, $\phi_3 = \hat\xi = 0$, and the risk premium is identical to expression (13). The risk premium is increasing in those costs (i.e., as $\phi_3$ gets larger), showing that adjustment costs amplify risk premium effects. Intuitively, as shown in (26), adjustment costs add an additional source of co-movement of capital returns with the SDF through fluctuations in firm-level Q. High beta firms invest more in expansions (both because their expected productivity is high and the cost of capital is low due to a low price of risk) and so the Q of these firms is more negatively correlated with the SDF (i.e., is more procyclical), inducing a higher risk premium.33

The extended model continues to give rise to mpk deviations that are persistent at the firm-level.
In particular, taking the unconditional expectation of (28) yields expressions for the persistent component of Empk and its cross-sectional dispersion:

$$E\left[Empk_{it+1}\right] = \frac{1}{1 - \rho\phi_3\left(\rho_x - \gamma_1\sigma_\varepsilon^2\right)}\,\beta_i\gamma_0\sigma_\varepsilon^2 \tag{29}$$

$$\Rightarrow \quad \sigma^2_{E[Empk_{it+1}]} = \left(\frac{1}{1 - \rho\phi_3\left(\rho_x - \gamma_1\sigma_\varepsilon^2\right)}\right)^2 \left(\gamma_0\sigma_\varepsilon^2\right)^2 \sigma_\beta^2. \tag{30}$$

The risk premium is an essential ingredient for the model to generate persistence in firm-level mpk; adjustment costs are not sufficient. When γ0 = 0 and hence risk effects are absent, there is no persistent Empk dispersion, even with adjustment costs (beta dispersion is also necessary). Thus, on their own, adjustment costs generate dispersion only in the transitory component of mpk (through their role in expression (28)). However, (29) and (30) show that when the persistent risk premium component is present, adjustment costs have a further amplification effect on that component, equal to the fraction in those expressions. How large might this amplification effect be? Using the parameter values from the next section, which include a relatively modest level of adjustment costs, the scaling factor from these costs is about 1.75, which implies a cross-sectional variance in Empk that is scaled up by a factor of three relative to the case with no adjustment costs (since the variance scales with the square of this factor, 1.75² ≈ 3.1). Thus, although they do not change the qualitative predictions of the model, adjustment costs can have an important quantitative effect on the results. In contrast, the transitory effects coming from these costs will turn out to be small.

33 For a related, but slightly different intuition, adjustment costs cause capital to be a long-lived asset and thus increase the length of the relevant time horizon when considering a capital investment. Because the amount of risk is increasing in the length of the horizon, the risk premium is naturally larger.

Finally, how do adjustment costs change the relationship between expected mpk, beta and expected stock returns?
Appendix D.4 shows that to a first-order, expected returns are not affected by adjustment costs and so the results from Section 3.1 continue to hold.34 Thus, the arguments made in that section linking the key parameters of the model to moments of asset returns go through unchanged.

4 Quantitative Analysis

In this section, we use the analytical insights laid out above to numerically quantify the extent of mpk dispersion arising from risk premia effects.

4.1 Parameterization

We begin by assigning values to the more standard production parameters of our model. Following Atkeson and Kehoe (2005), we set the overall returns to scale in production θ1 + θ2 to 0.85. We assume standard shares for capital and labor of 0.33 and 0.67, respectively, which gives θ1 = 0.28 and θ2 = 0.57. These values imply θ = 0.65.35 We assume a period length of one year and accordingly set the rate of depreciation to δ = 0.08. We estimate the adjustment cost parameter, ξ, in order to match the autocorrelation of investment, denoted $corr(\Delta k_t, \Delta k_{t-1})$, which is 0.38 in our data. Equation (39) in Appendix D.5 provides a closed-form expression for this moment, which reveals a tight connection with the severity of adjustment frictions.36

To estimate the parameters governing the aggregate shock process, we build a long sample of Solow residuals for the US economy using data from the Bureau of Economic Analysis on real GDP and aggregate labor and capital. The construction of this series is standard (details in Appendix B.4). With these data, we use a standard autoregression to estimate the parameters ρx and $\sigma_\varepsilon^2$. This procedure gives values of 0.94 and 0.0247 for the two parameters, respectively.37

34 Although this is only exactly true under our first-order approximation, Table 4 verifies numerically that at their estimated level, adjustment costs have relatively modest effects on moments of returns.

35 This is close to the values generally used in the literature.
For example, Cooper and Haltiwanger (2006) estimate a value of 0.59 for US manufacturing firms. David and Venkateswaran (2019) use a value of 0.62.

36 The expression also reveals that for ρx close to ρz, which we find in the data, described next, the autocorrelation of within-firm investment is almost invariant to the firm’s beta (indeed, the invariance is exact if ρx = ρz). Thus, even with dispersion in betas, we may not see large variation in this moment across firms.

37 The autoregression does not reject the presence of a unit root at standard confidence levels. We have also worked with the annual TFP series developed by John Fernald, available at: https://www.frbsf.org/economic-research/indicators-data/total-factor-productivity-tfp/. These data are only available for the more recent post-war period, but also show that the series is close to a random walk (i.e., the autocorrelation of growth rates is essentially zero). A potential concern with this approach is that these series reflect not only the process on exogenous technology, but also the effects of mpk dispersion itself (since dispersion affects measured aggregate productivity). However, at our estimates, these effects are small – mpk dispersion primarily impacts the level of aggregate productivity (which does not affect our estimates of persistence or volatility) but has only a small impact on its time-series properties (we discuss these different effects in Section 4.2) – suggesting that these series are reasonable approximations to the exogenous process. Further, we have also constructed an alternative series that is free from this concern directly from the firm-level data by averaging across the firms in each year. This gives results quite similar to the baseline, ρx = 0.92 and σε = 0.0245. Details are in Appendix B.4.

Under our assumptions, firm-level productivity (including the aggregate component) can be measured directly (up to an additive constant) as $y_{it} - \theta k_{it}$. After controlling for the level of aggregate productivity, a similar autoregression on the residual (firm-specific) component yields values for ρz and $\sigma_{\tilde{\varepsilon}}$ of 0.93 and 0.28, respectively.

Turning to the parameters of the SDF, we set ρ = 0.988 to match an average annual risk-free rate of 1.2%. Following the arguments in Section 3.1, we estimate the values of γ0 and γ1 to match the post-war (1947-2017) average annual excess return on the market index of 7.7% and Sharpe ratio of 0.53.38 This strategy is equivalent to matching both the mean and volatility of market excess returns (the standard deviation is 14.6%). To be comparable to the data, stock returns in the model need to be adjusted for financial leverage. To do so, we scale the mean and standard deviation of the model-implied returns by a factor of $1 + \frac{D}{E}$, where $\frac{D}{E}$ is the debt-to-equity ratio. We follow, e.g., Barro (2006) and assume an average debt-to-equity ratio of 0.5. Because both the numerator and denominator are scaled by the same constant, the Sharpe ratio is unaffected. For ease of interpretation, in what follows, we report the properties of levered returns. To compute the model-implied market return, we must also take a stand on the mean beta across firms. Assuming that the mean of $\hat{\beta}_i$ (the underlying productivity beta) is one, and using the value of ω (the sensitivity of wages to aggregate shocks) suggested by İmrohoroğlu and Tüzel (2014) of 0.20, we can compute the mean beta to be 1.99.39 This is simply the mean productivity beta adjusted for the leverage effects of labor liabilities. This procedure yields values of γ0 = 32 and γ1 = −140.

Finally, again following the insights in Section 3.1, we estimate the dispersion in betas to match the cross-sectional dispersion in expected stock returns. To be consistent with the broad literature, we use the expected returns predicted from the Fama-French model as computed in Section 2.2. We de-lever firm-level expected returns following the approach in Bharath and Shumway (2008) and Gilchrist and Zakrajsek (2012) (details in Appendix B.2). This procedure yields an estimated average within-industry standard deviation of un-levered expected returns of 0.127 (we report details and plot the full histogram of the expected return distribution in Appendix B.2: for example, the mean is about 9%, and the interquartile range is just under 12%; the standard deviation of raw expected returns, i.e., not de-levered or controlling for industry, is about 0.156).40 Feeding this value into our quantitative model yields an estimate for σβ of 12, and adjusting for the scaling 1 − θ2 gives the dispersion in underlying productivity betas, $\sigma_{\hat{\beta}}$, equal to 4.80.41 We parameterize the model using simulated method of moments (details in Appendix E). Table 3 summarizes our empirical approach/results.

38 We calculate these values using annualized monthly excess returns obtained from Kenneth French’s website, http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.

39 İmrohoroğlu and Tüzel (2014) estimate this value to match the cyclicality of wages.

40 Our estimates are consistent with those in Lewellen (2015), who reports moments of the expected return distribution from a number of predictive models. For example, using monthly data, he finds an annualized cross-sectional standard deviation of up to 17.5% (Model 3, Panel A, Table 5 of that paper).

41 Although this is a significant amount of dispersion, it composes only a modest fraction of overall dispersion in firm-level productivity. To see this, note that the cross-sectional variance of productivity at time t is $\left(\frac{\sigma_{\hat{\beta}}}{1-\theta_2}\right)^2 x_t^2 + \sigma_z^2$, where $\sigma_z^2 = \frac{\sigma_{\tilde{\varepsilon}}^2}{1-\rho_z^2}$. Plugging in our estimates and assuming, for example, that the economy is 2% above or below trend, gives the first term to be about 8% of the total. It remains relatively modest for reasonable deviations from trend. Thus, despite firms’ diverse sensitivities to business cycle shocks, our estimates still point to firm-level idiosyncratic conditions as the dominant factor driving cross-sectional heterogeneity.

Table 3: Parameterization - Summary

Parameter   Description                            Value
Production
θ1          Capital share                          0.28
θ2          Labor share                            0.57
δ           Depreciation rate                      0.08
ξ           Adjustment cost                        0.04
σβ̂          Std. dev. of risk exposures            4.80
Stochastic Processes
ρx          Persistence of agg. shock              0.94
σε          Std. dev. of agg. shock                0.0247
ρz          Persistence of idiosyncratic shock     0.93
σε̃          Std. dev. of idiosyncratic shock       0.28
ω           Wage elasticity                        0.20
Stochastic Discount Factor
ρ           Time discount rate                     0.988
γ0          SDF – constant component               32
γ1          SDF – time-varying component           -140

4.2 Risk-Based Dispersion in MPK

Table 4 presents our main quantitative results. We report four variants of the framework. Column (1) (“Baseline”) corresponds to the full model with time-varying risk and adjustment costs. Column (2) (“No Risk”) shows results from a version where risk effects are completely absent (specifically, we set γ0 = γ1 = 0). Column (3) (“Only Risk”) reports the effects of risk premia alone, without adjustment costs (i.e., ξ = 0). Column (4) (“Constant Risk”) examines a version with adjustment costs but a constant price of risk (i.e., γ1 = 0). Column (5) (“Only Constant Risk”) has a constant price of risk and no adjustment costs. Our goal in showing these different permutations is to understand the role that each element of the model plays in leading to various patterns in mpk dispersion. In order to interpret the results as a decomposition, each variant holds all the parameters fixed at their estimated values except the one under study.

Table 4: Risk Premia and Misallocation

                                            Baseline   No Risk   Only Risk   Constant Risk   Only Constant Risk
                                              (1)        (2)        (3)          (4)                (5)
MPK Implications
$E\sigma^2_{Empk}$                            0.17       0.03       0.05         0.16               0.05
  % of total $\sigma^2_{mpk}$                 37.9%      7.4%       11.5%        35.9%              10.4%
$\sigma^2_{Empk}$ (permanent component)       0.14       0.00       0.05         0.13               0.05
  % of permanent $\sigma^2_{mpk}$             47.3%      0.1%       15.7%        41.9%              15.7%
$\Delta a$                                    0.07       0.01       0.02         0.07               0.02
$corr(\sigma^2_{Empk_t}, x_t)$                −0.31      0.01       −0.97        0.45               0.00
Moments
$E r^e_m$                                     0.08       0.00       0.10         0.05               0.06
$E SR_m$                                      0.53       0.00       0.61         0.52               0.65
$corr(\Delta k_t, \Delta k_{t-1})$            0.38       0.38       −0.02        0.38               −0.03

Long-run effects. The first row of the table shows the average level of mpk dispersion that arises in each variant of the model.42 The second row shows the percentage of total observed misallocation that this value accounts for. In our sample, overall $\sigma^2_{mpk}$ is 0.45. This is the denominator in that row. Next, we calculate the dispersion stemming from only the permanent component of firm-level MPK deviations (given by equation (28)), which we report in the third row of the table. To compute this value in the data, for each firm, we regress the time-series of its mpk on a firm-level fixed effect. The fixed effect is the permanent component of firm-level mpk and the residuals are the transitory components. We then compute the variance of the permanent component, which yields a value of $\sigma^2_{mpk} = 0.30$, about two-thirds of the total.43 This is the denominator in the fourth row of the table, which displays the model-implied permanent dispersion as a percentage of the observed permanent component in the data. The next row quantifies the implications of the estimated dispersion for the long-run level of aggregate TFP.
It reports the gains in the average level of TFP from eliminating the predicted mpk dispersion, denoted ∆a.44 This is essentially an application of expression (18).

42 With adjustment costs, we do not have analytic expressions for period-by-period Empk dispersion. We compute these values using simulation and then average over them. Without adjustment costs, we can use expression (17) directly.

43 Other approaches give a similar breakdown, see, e.g., David and Venkateswaran (2019).

Column (1) shows that the full model generates mpk dispersion of about 0.17. This accounts for about 38% of overall mpk dispersion in the data.45 Of the model-implied dispersion, about 0.14 is permanent in nature, which explains about 47% of the permanent component in the data. The costs of this dispersion represent a loss in long-run TFP of about 7%. In the full model, both risk effects and adjustment costs lead to dispersion in Empk. To home in on the role of risk alone, column (2) shows the same statistics when we eliminate risk effects and only adjustment costs are present. Adjustment costs on their own generate relatively modest dispersion in Empk ($E\sigma^2_{Empk} = 0.03$) and, as proved in Section 3.2, do not lead to any dispersion in the permanent component ($\sigma^2_{Empk} = 0$). Thus, risk premia effects are crucial to generating the substantial and persistent dispersion in column (1). Subtracting column (2) from column (1) captures the contribution of risk effects alone: Empk dispersion of 0.14 (about 30% of the total in the data), dispersion in the permanent component of 0.14 (47% of the data) and long-run TFP losses of 6%. These results suggest (i) heterogeneity in risk premia can generate significant MPK dispersion, particularly when compared to the permanent component in the data, and (ii) the consequences for measures of aggregate performance such as TFP – i.e., the “productivity costs” of business cycles – can be substantial.
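The subtraction of column (2) from column (1) can be reproduced with a few lines of arithmetic. The sketch below is only an illustrative check using the rounded entries of Table 4 and the data moments quoted in the text (0.45 and 0.30); the variable and function names are hypothetical, and because the inputs are rounded, the shares it produces match the reported figures only approximately.

```python
# Rounded values transcribed from Table 4, columns (1) and (2).
baseline = {"E_var_Empk": 0.17, "var_perm": 0.14, "tfp_gain": 0.07}
no_risk = {"E_var_Empk": 0.03, "var_perm": 0.00, "tfp_gain": 0.01}

TOTAL_VAR_MPK = 0.45  # overall variance of mpk in the data (from the text)
PERM_VAR_MPK = 0.30   # variance of the permanent component in the data


def risk_contribution(full, without_risk):
    """Column (1) minus column (2): the part attributed to risk effects alone."""
    return {key: round(full[key] - without_risk[key], 2) for key in full}


risk = risk_contribution(baseline, no_risk)
share_total = risk["E_var_Empk"] / TOTAL_VAR_MPK
share_perm = risk["var_perm"] / PERM_VAR_MPK

print(risk)
print(f"{share_total:.0%} of total dispersion, {share_perm:.0%} of the permanent component")
```

With these rounded inputs the risk-only contribution is 0.14 in both rows, roughly 31% of total dispersion and 47% of the permanent component, with a TFP effect of 0.06, in line with the figures quoted above.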
Column (3) removes adjustment costs to illustrate their amplification of existing risk premia effects. On their own (i.e., without adjustment costs), risk premia generate mpk dispersion of 0.05, which accounts for 11.5% of total $\sigma^2_{mpk}$ in the data and explains about 16% of the permanent component. Thus, although the effects of risk premia remain significant in isolation, they are less than half of those in column (1), where the amplification from adjustment costs is taken into account. TFP losses are also smaller, but remain significant, at approximately 2%. Columns (4) and (5) show that the majority of these effects stems from the presence of a high persistent component in the price of risk, i.e., γ0, rather than from the time-variation from γ1. Setting γ1 = 0 only modestly reduces the size of these effects in the presence of adjustment costs (compare columns (1) and (4)) and has a negligible effect on the results without them (columns (3) vs. (5)). The implication is that time-variation in the price of risk does not add much to the long-run level of mpk dispersion.

44 This calculation does not mean that policies eliminating this source of mpk dispersion would necessarily be desirable. We merely see this as a useful way to quantify the implications of our findings.

45 The model-implied ratio of the standard deviation of MPK to that in expected stock returns is about 3.25 ($\sqrt{0.17}/0.127$), which is close to the empirical estimates in Section 2.2.

Countercyclical dispersion. The last row in the top panel examines the second main implication of the theory, namely, the countercyclicality of mpk dispersion, which we measure as the correlation of $\sigma^2_{Empk_t}$ with the state of the business cycle, $x_t$. Column (1) shows that the full model generates significantly countercyclical dispersion in Empk – the correlation of $\sigma^2_{Empk_t}$ with the state of the cycle is -0.31.
To put this figure in context, Table 6 in Appendix B.3 shows that the correlation between $\sigma^2_{mpk}$ and the cyclical component of aggregate productivity in the data is -0.27. Thus, the model predicts countercyclical dispersion on par with this value. Column (2) shows that adjustment costs alone do not generate any cyclicality in Empk dispersion. Column (3) shows that as the only factor behind Empk dispersion, the time-varying risk premium would lead to an almost perfectly negative correlation with the business cycle. This is a clear implication of equation (16). The additional presence of adjustment costs in the first column confounds this relationship and leads to a smaller correlation (in absolute value) that is more in line with the data. Finally, the last two columns illustrate that time-varying risk is key to generating countercyclical dispersion. Without this element, Empk dispersion is significantly positive with adjustment costs and without them, is exactly acyclical. Thus, our findings suggest that the interaction of a countercyclical price of risk with adjustment frictions is crucial in yielding a negative (though far from negative one) correlation between Empk dispersion and the state of the business cycle.

To highlight the potential implications of the countercyclical Empk dispersion produced by our model, consider the connection with the empirical findings in Eisfeldt and Rampini (2006), who show that firm-level dispersion measures tend to be countercyclical, yet most capital reallocation is procyclical. Our theory can – at least in part – reconcile this observation due to the countercyclical nature of factor risk prices and the high beta of high MPK firms: countercyclical reallocation would entail moving capital to the riskiest of firms in the riskiest of times. Thus, in light of our results, it may not be as surprising that countercyclical dispersion obtains, even in a completely frictionless environment.46

Moments.
In the bottom panel of Table 4, we investigate the role of each element in matching the target moments. Our full model in column (1) is directly parameterized to match the three moments, i.e., the equity premium, Sharpe ratio and autocorrelation of investment. Column (2) shows that without risk aversion, risk premia are essentially zero. Column (3) shows that, as implied by the approximation in Section 3.2, adjustment costs have a modest effect on the properties of returns (eliminating them somewhat raises the equity premium and Sharpe ratio). However, the autocorrelation of investment falls dramatically without them, indeed, becoming slightly negative (due to the mean-reverting nature of shocks). Thus, some degree of adjustment costs is crucial for matching this moment. Comparing columns (1) and (4) shows that without time-varying risk, the model struggles to match the equity premium, which falls almost by half, from about 8% to 5%. As implied by expressions (24), (23) and (39), time-varying risk is tightly linked to average excess returns, but has only modest effects on the average Sharpe ratio and the autocorrelation of investment. A similar pattern emerges from columns (3) and (5) – in the absence of adjustment costs, removing time-varying risk significantly reduces the equity premium but has smaller effects on the other two moments.

46 The main measure of reallocation in Eisfeldt and Rampini (2006) includes both mergers and acquisitions (M&A) as well as sales of disassembled capital (sales of property, plant and equipment). Even excluding M&A, they find the latter is significantly procyclical (correlation with GDP of about 0.4; data from https://sites.google.com/site/andrealeisfeldt/home/capital-reallocation-and-liquidity).
In sum, the results in Table 4 show, first, that heterogeneity in firm-level risk premia leads to quantitatively important dispersion in mpk, with significant adverse effects on aggregate TFP; moreover, much of this dispersion is persistent and can account for a significant portion of what seems to be a puzzling pattern in the data, namely, persistent mpk deviations at the firm-level. Second, these risk premium effects add a notably countercyclical element to mpk dispersion, going some way towards reconciling the countercyclical nature of firm-level dispersion measures.

4.3 Other Distortions

Recent work has pointed to a number of additional factors (beyond fundamentals and adjustment frictions) that may affect firms’ investment decisions and lead to mpk dispersion, for example, financial frictions or policy-induced distortions. Moreover, it has been pointed out that attempts to identify one of these forces – while abstracting from others – may yield misleading conclusions. This section demonstrates that our strategy of using asset market data is robust to this critique. In other words, our approach yields accurate estimates of risk premium effects, even in the presence of other, un-modeled, distortions.

Rather than take a stand on the exact nature of these factors, we follow the broad literature, e.g., Hsieh and Klenow (2009) and Restuccia and Rogerson (2008), and model these distortions using a flexible class of “taxes” or “wedges,” which can have a rich correlation structure over time and with both firm-level characteristics and aggregate conditions (in Section 5 we analyze two additional sources of measured mpk dispersion, namely, heterogeneity in markups and production function parameters). Specifically, we introduce a proportional “tax” on firm-level operating profits, $1 - e^{\tau_{it+1}}$ (so that the firm keeps a portion $e^{\tau_{it+1}}$), of the form:

$$\tau_{it+1} = -\nu_1 z_{it+1} - \nu_2 x_{t+1} - \nu_3 \beta_i x_{t+1} - \eta_{it+1}. \tag{31}$$

The first term captures a component correlated with the firm’s idiosyncratic productivity, where the strength of the relationship is governed by ν1. If ν1 > 0, the wedge discourages (encourages) investment by firms with high (low) idiosyncratic productivity. If ν1 < 0, the opposite is true. The next two terms capture the correlation of the wedge with the state of the business cycle, $x_t$. We allow for a component through which all firms are equally distorted by the cyclical portion of the wedge, captured by ν2, and a component through which high beta firms are disproportionately affected by the cyclical portion, captured by ν3. Through this piece, the wedge can be correlated with firm-level betas. The last term, $\eta_{it+1}$, captures factors that are uncorrelated with firm or aggregate conditions. It can be either time-varying or fixed and is normally distributed with mean zero and variance $\sigma_\eta^2$. Low (high) values of η spur (reduce) investment by firms irrespective of their underlying characteristics or the state of the business cycle.47 David and Venkateswaran (2019) show that a related formulation describes observed MPK dispersion well (although they do not have beta dispersion or aggregate shocks). We loosely refer to the wedge as a “distortion,” although we do not take a stand on whether it stems from efficient factors or not, simply that there are other frictions in the allocation process.

To gain intuition, we analyze each component of the distortion in turn. First, we focus only on the first and last terms, i.e., we set ν2 = ν3 = 0. In this case, the wedge is purely idiosyncratic in the cross-section, i.e., it is always mean zero and has no aggregate component. This formulation is closest to the ones commonly used in the literature, which has typically focused on idiosyncratic distortions with no aggregate shocks.
Appendix G derives the following expressions for expected mpk and its cross-sectional variance:

$$Empk_{it+1} = \alpha + \nu_1\rho_z z_{it} + \eta_{it+1} + \beta_i\gamma_t\sigma_\varepsilon^2, \quad \Rightarrow \quad \sigma^2_{Empk_t} = \left(\nu_1\rho_z\right)^2\sigma_z^2 + \sigma_\eta^2 + \sigma_\beta^2\left(\gamma_t\sigma_\varepsilon^2\right)^2. \tag{32}$$

In this case, Empk includes (i) a component that reflects the correlated distortion, ν1, and depends on the firm’s expectations of its idiosyncratic productivity ($\rho_z z_{it}$), leading to mpk deviations that are correlated with idiosyncratic productivity, and (ii) a term that reflects the uncorrelated distortion, η, which leads to mpk deviations that are uncorrelated with productivity. The last term reflects the risk premium. All of these components lead to dispersion in Empk (dispersion in realized mpk also reflects uncertainty over shocks). Crucially, expression (32) reveals that the risk premium (and resulting risk-based dispersion) are unaffected by the presence of these additional distortions. Further, Appendix G proves that expected stock returns are also unaffected, i.e., equation (21) still holds. The result implies that the mapping from expected returns to beta is, to a first-order, unaffected by the distortions, as is the mapping from beta dispersion to its effects on Empk. This leads to an important finding: even in the richer environment here featuring a common class of mis-allocative distortions, using stock market data continues to yield accurate estimates of the effects of heterogeneous risk exposures alone. Clearly, a strategy using mpk dispersion directly does not share this feature: measuring risk effects alone would be complicated by the presence of other distortions.

47 We have also studied a version where the distortion is size-dependent, i.e., $\tau_{it+1} = -\nu_k k_{it+1}$. This turns out to be equivalent to the specification in (31) where $\nu_1 = \nu_3 = \frac{\nu_k}{1-\theta+\nu_k}$ and $\nu_2 = 0$.

Next, we add the components that are correlated with aggregate conditions. First, consider the case with a common cyclical component, i.e., ν2 ≠ 0.
We can prove a similar result as with only idiosyncratic wedges – the distortion does not affect the cross-sectional dispersion in expected stock returns or the risk-related dispersion in Empk. Finally, consider the case where high beta firms are disproportionately affected by the aggregate distortion, i.e., ν3 ≠ 0. If ν3 > 0, the distortion discourages (encourages) investment by high (low) beta firms in good times and the reverse in bad times (in this sense, it works like a cyclical productivity-dependent component, since high beta firms are relatively more productive in good times). There is also an aggregate implication of the wedge: averaging across firms gives $\bar{\tau}_{t+1} = -\nu_3\bar{\beta}x_{t+1}$. If ν3 > 0 (< 0), the tax is pro- (counter) cyclical. Empk is given by:

$$Empk_{it+1} = \alpha + \nu_1\rho_z z_{it} + \nu_3\beta_i\rho_x x_t + (1-\nu_3)\,\beta_i\gamma_t\sigma_\varepsilon^2 + \eta_{it+1}. \tag{33}$$

The second and third terms capture the effects of idiosyncratic and aggregate distortions, respectively, and are independent of risk. The second to last term captures the risk premium, which is now scaled by a factor 1 − ν3. Further, we can show that expected stock returns are scaled by exactly the same factor. The key implication is that all results from the baseline version go through, with a reinterpretation of the beta we recover from stock market data: rather than picking up the true beta alone, stock market returns yield a measure of the distorted beta, (1 − ν3) βi. Since this is the object that also determines the risk premia in MPK, the remainder of our results stay largely unaffected.48

4.4 Directly Measured Productivity Betas

Our baseline approach to measuring firm-level risk exposures used the link between beta and expected stock returns laid out in Section 3.1. Here, we use an alternative strategy to estimate the dispersion in these exposures using only production-side data. In one sense, this approach is more direct – there is no need to employ firm-level stock market data to measure risk exposures.
On the other hand, computing betas directly from production-side data has its drawbacks – the data are of a lower frequency (quarterly at best) and the time dimension of the panel is shorter. Further, it may be difficult to apply this method to firms in developing countries (where measured misallocation tends to be larger), since most firm-level datasets there have relatively short panels and are at the annual frequency. For those reasons, we view our results here as an informative check on our baseline findings above.

48 One caveat is that when taking the cross-sectional variance of (33), an additional term arises from the covariance of the risk premium term with the aggregate distortion term. If the wedge worsens in downturns, i.e., ν3 < 0, which may be a plausible conjecture, we can prove that our baseline calculations yield a lower bound on risk premium effects on mpk dispersion. If the wedge is procyclical, i.e., ν3 > 0, we could be at risk of overstating these effects. However, Appendix G derives an upper bound on this bias at the estimated parameters and shows that it is quantitatively small.

For each firm, we regress measured productivity growth, i.e., $\Delta z_{it} + \beta_i\Delta x_t$, on aggregate productivity growth $\Delta x_t$. It is straightforward to verify that the coefficient from this regression is exactly equal to βi. Using these estimates, we can compute the firm’s underlying productivity beta, $\hat{\beta}_i$, and calculate the cross-sectional dispersion in these estimates, $\sigma^2_{\hat{\beta}}$. We have applied this procedure using three different measures of the aggregate shock: (i) our long sample of Solow residuals, (ii) the series we construct from firm-level data (both of these are described in Appendix B.4) and (iii) the Fernald annual TFP series. The results yield values of $\sigma_{\hat{\beta}}$ of 6.4, 4.3 and 5.9, respectively. Recall that our estimate for this value using stock return data was 4.8, which is in line with – and towards the lower end of – the range found here.
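The firm-by-firm regression described above can be sketched on simulated data. This is a minimal illustration rather than the procedure applied to the actual panel: the sample sizes, shock volatilities and the distribution of the true betas are hypothetical, the function name `estimate_betas` is ours, and the further step of converting βi into the underlying productivity beta is not shown.

```python
import numpy as np


def estimate_betas(firm_growth, agg_growth):
    """OLS slope (with intercept) of each firm's productivity growth on
    aggregate productivity growth.

    firm_growth: array of shape (n_firms, T)
    agg_growth:  array of shape (T,)
    """
    dx = agg_growth - agg_growth.mean()
    dy = firm_growth - firm_growth.mean(axis=1, keepdims=True)
    return dy @ dx / (dx @ dx)


rng = np.random.default_rng(0)
n_firms, T = 500, 40                          # hypothetical panel dimensions

true_beta = rng.normal(1.0, 0.5, n_firms)     # hypothetical firm exposures
d_x = rng.normal(0.0, 0.02, T)                # aggregate productivity growth
d_z = rng.normal(0.0, 0.05, (n_firms, T))     # idiosyncratic productivity growth

measured_growth = d_z + true_beta[:, None] * d_x
beta_hat = estimate_betas(measured_growth, d_x)

print("dispersion of true betas:     ", true_beta.std())
print("dispersion of estimated betas:", beta_hat.std())
```

In a short panel the estimated betas contain sampling noise from the idiosyncratic shocks, so their cross-sectional dispersion will typically exceed that of the true exposures; without idiosyncratic shocks the regression recovers each βi exactly.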
4.5 Measurement Concerns

In this section, we address a number of potential measurement-related issues. First, following the recent literature, e.g., Hsieh and Klenow (2009) and Gopinath et al. (2017), we measure firm-level capital stocks using reported book values. An alternative approach is to use the perpetual inventory method along with detailed data on investment flows and investment good price deflators to construct capital stocks. Although this is in general an important issue for the firm dynamics/misallocation literatures, our empirical approach allows us to largely avoid this concern. To see this, notice that our estimation relies on measures of firm-level capital in only two places: first, to calculate the properties of idiosyncratic shocks, i.e., $\rho_z$ and $\sigma^2_{\tilde{\varepsilon}}$, and second, to calculate the autocorrelation of investment, which largely identifies the extent of adjustment costs. As shown, for example, in equations (16) and (22), idiosyncratic shocks have no effect on dispersion in expected mpk or on expected stock returns (the latter to a first-order). In our framework, idiosyncratic risk, though crucial in explaining firm dynamics, is not priced, and thus does not affect risk premia.49 David and Venkateswaran (2019) measure firm-level capital using both approaches and find a larger serial correlation of investment using the perpetual inventory method. Since our adjustment cost estimate is increasing in the serial correlation, this approach would likely lead to larger estimates, and, as shown above, further amplify the risk premium effects we uncover.50 Largely avoiding the use of firm-level capital measures is an important feature of our use of stock market data.51

How about the effects of measurement error? Our use of stock market data is also useful in this regard: in general, stock market data should be quite precisely measured and so largely free of this concern. Measurement error in capital may affect our estimate of adjustment costs, but we can show that this error would likely lead us to a conservative estimate for these costs. To see this, consider first the case of (classical) measurement error that is iid over time. This unambiguously reduces the observed serial correlation (i.e., the true one is higher), which would yield higher adjustment cost estimates. Alternatively, consider the opposite case where the measurement error is permanent. Then, since we work with the growth rate of capital, our results would be unaffected. Of course, similarly to mis-measured capital, measurement error may affect the observed amount of mpk dispersion.

These issues may also be concerns for our estimates of productivity betas in the previous section, where we used measures of capital to calculate firm-level productivity. However, in Appendix H we show that any potential bias is likely quite small. Loosely speaking, mis-measured capital introduces error into the dependent variable of the regression, which, under certain conditions, will not affect our estimates (specifically, so long as changes in the measurement error are uncorrelated with changes in aggregate productivity). In that appendix we also investigate the potential bias in those estimates coming from unobserved heterogeneity in parameters across firms, i.e., $\theta$, and show that it is quite small.

49 We have also verified that idiosyncratic shocks have little effect on our estimates in the full non-linear model (see Appendix I).

50 If the serial correlation is lower, we can generally think of the results in column 2 of Table 4, where we set adjustment costs to zero, as a lower bound. Although not part of our estimation, we also use firm-level capital to calculate total mpk dispersion, e.g., the denominator in the second row of Table 4. David and Venkateswaran (2019) show that this statistic is very similar under the two measurement approaches (see Tables 2 and 18 in that paper).
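The two polar cases of measurement error discussed above can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the paper's procedure: true capital growth is taken to follow an AR(1), the error is added to log capital, and the persistence, volatilities and sample length are all hypothetical.

```python
import numpy as np

def autocorr1(x):
    """First-order sample autocorrelation of a series."""
    x = x - x.mean()
    return (x[1:] @ x[:-1]) / (x @ x)

rng = np.random.default_rng(1)
T, rho = 20_000, 0.5                    # hypothetical sample length and persistence

# True capital growth follows an AR(1); true log capital is its running sum.
shocks = rng.normal(0.0, 0.1, size=T)
growth = np.zeros(T)
for t in range(1, T):
    growth[t] = rho * growth[t - 1] + shocks[t]
log_k = np.cumsum(growth)

# Case 1: classical measurement error in log capital, iid over time.
log_k_iid = log_k + rng.normal(0.0, 0.05, size=T)
# Case 2: permanent measurement error (a constant shift in log capital).
log_k_perm = log_k + 0.3

ac_true = autocorr1(np.diff(log_k))
ac_iid = autocorr1(np.diff(log_k_iid))    # pushed down by the iid error
ac_perm = autocorr1(np.diff(log_k_perm))  # identical: the constant differences out
print(f"true: {ac_true:.3f}, iid error: {ac_iid:.3f}, permanent error: {ac_perm:.3f}")
```

Under these assumptions the iid error lowers the measured serial correlation of capital growth, while the permanent error drops out of growth rates entirely, which is the logic behind treating the resulting adjustment cost estimate as conservative.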
51 Of course, some of the measured mpk dispersion in the data (i.e., the denominators in rows 2 and 4 of Table 4) may be coming from mis-measurement of capital.

5 The Sources of Betas

Cross-firm variation in exposure to aggregate shocks, i.e., beta, is an essential ingredient in our theory. In this section, we investigate some potential sources of this type of heterogeneity, namely, dispersion in technological parameters (input elasticities in production) and markups as well as in the sensitivity of demand to business cycle fluctuations. Importantly, we show that each of these forms of heterogeneity is reflected in our measured betas, so that our main results on risk premia and mpk dispersion go through unchanged. Our goal here is simply to gain some further insight into why firms exhibit different sensitivities to aggregate shocks.

Heterogeneous technologies/markups. Firm-level heterogeneity in production function parameters or markups is a potential source of beta dispersion. Intuitively, both of these forces lead firms to have different responsiveness, and so exposure, to aggregate shocks. In Appendix I, we explore each of these in detail (to allow for markup dispersion, we extend our baseline setup to an environment where firms produce differentiated goods, are monopolistically competitive and face constant, but potentially heterogeneous, elasticities of demand). First, we show that a version of our analysis in Section 3 continues to hold in both cases, where the firm’s beta now also reflects these additional sources of heterogeneity. Second, we calculate how much of the observed beta dispersion can be attributed to each of these forces. Using dispersion in labor’s share of revenue as a likely upper bound for technology dispersion, we find it can potentially account for about 12% of the overall standard deviation of betas from Section 4. Similarly, using recent estimates of markup dispersion among Compustat firms, we find it can account for about 6%.
Thus, in total, heterogeneity in input elasticities and markups is likely to explain at most about 18% of measured beta dispersion. Although this is a significant fraction, these findings also suggest that the majority of beta dispersion seems to arise from other sources.52

Heterogeneous demand sensitivities. A recent literature has pointed out variation in the response of firm-level demand to the business cycle. For example, Jaimovich et al. (2019) document a “trading down” phenomenon: during expansions, when purchasing power is high, households tend to consume higher quality goods and in downturns substitute towards lower quality ones. Nevo and Wong (2015) show that during the Great Recession, consumers substituted towards cheaper generic products and discount stores, and Coibion et al. (2015) show that during downturns, consumers substitute towards low-price retailers.53 This pattern makes high quality products more procyclical and lower quality ones less so (or even countercyclical). To see the implications of those findings for our analysis, consider the following system of demand and production functions:

$$Q_{it} = P_{it}^{-\mu} \left( X_t^{\hat{\beta}_i} \hat{Z}_{it} \right)^{\mu}, \qquad Y_{it} = K_{it}^{\hat{\theta}_1} N_{it}^{\hat{\theta}_2}.$$

Here, $X_t$ is interpreted as an aggregate component of demand rather than technology (it is straightforward to include aggregate technology shocks as well) and $\hat{Z}_{it}$ as idiosyncratic demand. The firm-specific sensitivity to $X_t$, $\hat{\beta}_i$, captures the idea that in expansions, when demand for all goods is high, consumers substitute towards some goods and away from others. In downturns, when $X_t$ is low, the opposite pattern holds: consumers substitute away from those same goods. This is a simple way to capture the “trading down” phenomenon. Firm revenues are given by

$$P_{it} Y_{it} = X_t^{\hat{\beta}_i} \hat{Z}_{it} K_{it}^{\theta_1} N_{it}^{\theta_2},$$

where $\theta_j = \left(1 - \frac{1}{\mu}\right) \hat{\theta}_j$, $j = 1, 2$. With this reinterpretation, the expression is exactly equivalent to (8). In other words, differences in the responsiveness of firm-level demand to the business cycle may be behind our beta estimates.

Since direct data on quality are hard to come by, systematically quantifying the dispersion in these “demand betas” is challenging. However, in Appendix I, we examine one industry where we were able to obtain a proxy for quality, namely average check per person (price) in SIC 5812, Eating Places (i.e., restaurants). Pricing data are from a number of publicly available sources, including company SEC filings and investment bank reports. The appendix shows that higher quality establishments, as proxied by price, have greater exposure to aggregate shocks, and higher expected stock returns and MPK. Thus, the main message of that study of a single industry is that differences in the cyclicality of firm-level demand due to quality differences and “trading down” seem a promising explanation for beta dispersion.

52 Appendix I also investigates potential heterogeneity in the depreciation rate, $\delta$, and the parameters governing idiosyncratic shocks, $\rho_z$ and $\sigma^2_{\tilde{\varepsilon}}$, as well as the effects of adjustment costs alone. We find that these forces are unlikely to account for much of the dispersion in risk premia.

53 A related literature documents a similar “flight from quality” in response to contractionary exchange rate devaluations, e.g., Burstein et al. (2005), Bems and Di Giovanni (2016) and Chen and Juvenal (2018).

6 Conclusion

In this paper, we have revisited the notion of “misallocation” from the perspective of a risk-sensitive, or risk-adjusted, version of the stochastic growth model with heterogeneous firms. The standard optimality condition for investment in this framework suggests that expected firm-level marginal products should reflect exposure to macroeconomic risks, and their pricing. To the extent that firms are differentially exposed to these risks, cross-sectional dispersion in MPK may not only reflect true misallocation, but also risk-adjusted capital allocation.
We provide empirical support for this proposition and demonstrate that a suitably parameterized model of firm-level investment suggests that, indeed, risk-adjusted capital allocation accounts for a significant fraction of observed MPK dispersion among US firms. Importantly, much of this dispersion is persistent in nature, which speaks to the large portion of observed MPK dispersion that arises from seemingly persistent/permanent sources. Further, our setup leads to a novel link between aggregate volatility, risk premia and long-run productivity: our results suggest that there can be substantial “productivity costs” of business cycles.

There are several promising directions for future research. Our framework points to a new connection between business cycle dynamics and the cross-sectional allocation of inputs. Investigation of this link, for example, a further exploration of the sources of beta variation across firms, would lead to a better understanding of the underlying causes of observed marginal product dispersion. Much of the misallocation literature examines differences in marginal product dispersion across countries. A natural next step would be to implement a similar analysis in a set of developing countries: because those countries typically have high business cycle volatility, it may be that dispersion in risk premia is larger there. The tractability of our setup allowed us to quantify the effects of financial market considerations, e.g., cross-sectional variation in required rates of return, on measures of macroeconomic performance, i.e., aggregate TFP. This link provides a new way to evaluate the implications of the rich set of empirical findings in cross-sectional asset pricing.
For example, pursuing multifactor/financial shock extensions of our analysis (e.g., along the lines laid out in Appendix F) to incorporate the many risk factors pointed out in that literature would be fruitful to measure the implications of those factors for allocative efficiency. Of particular interest would be whether those factors are efficient or not, e.g., to what extent do capital allocations reflect the “mis-pricing” of assets.

References

Alvarez, F. and U. J. Jermann (2004): “Using asset prices to measure the cost of business cycles,” Journal of Political Economy, 112, 1223–1256.
Asker, J., A. Collard-Wexler, and J. De Loecker (2014): “Dynamic inputs and resource (mis)allocation,” Journal of Political Economy, 122, 1013–1063.
Atkeson, A. and P. J. Kehoe (2005): “Modeling and measuring organization capital,” Journal of Political Economy, 113, 1026–1053.
Balvers, R. J., L. Gu, D. Huang, M. Lee-Chin, et al. (2015): “Profitability, value and stock returns in production-based asset pricing without frictions,” Journal of Money, Credit, and Banking.
Barro, R. J. (2006): “Rare disasters and asset markets in the twentieth century,” The Quarterly Journal of Economics, 121, 823–866.
Bartelsman, E., J. Haltiwanger, and S. Scarpetta (2013): “Cross Country Differences in Productivity: The Role of Allocative Efficiency,” American Economic Review, 103, 305–334.
Belo, F., X. Lin, and S. Bazdresch (2014): “Labor hiring, investment, and stock return predictability in the cross section,” Journal of Political Economy, 122, 129–177.
Bems, R. and J. Di Giovanni (2016): “Income-induced expenditure switching,” American Economic Review, 106, 3898–3931.
Bharath, S. T. and T. Shumway (2008): “Forecasting Default with the Merton Distance to Default Model,” Review of Financial Studies, 21, 1339–1369.
Binsbergen, J. V. and C. Opp (2017): “Real Anomalies,” Wharton Working Paper.
Buera, F. J., J. P. Kaboski, and Y. Shin (2011): “Finance and Development: A Tale of Two Sectors,” American Economic Review, 101, 1964–2002.
Burstein, A., M. Eichenbaum, and S. Rebelo (2005): “Large devaluations and the real exchange rate,” Journal of Political Economy, 113, 742–784.
Chen, N. and L. Juvenal (2018): “Quality and the great trade collapse,” Journal of Development Economics, 135, 59–76.
Cochrane, J. (1991): “Production-Based Asset Pricing and the Link Between Stock Returns and Economic Fluctuations,” Journal of Finance, 46, 207–234.
Coibion, O., Y. Gorodnichenko, and G. H. Hong (2015): “The cyclicality of sales, regular and effective prices: Business cycle and policy implications,” American Economic Review, 105, 993–1029.
Cooper, R. W. and J. C. Haltiwanger (2006): “On the nature of capital adjustment costs,” The Review of Economic Studies, 73, 611–633.
David, J. M., E. Henriksen, and I. Simonovska (2014): “The risky capital of emerging markets,” Tech. rep., National Bureau of Economic Research.
David, J. M., H. A. Hopenhayn, and V. Venkateswaran (2016): “Information, Misallocation and Aggregate Productivity,” The Quarterly Journal of Economics, 131, 943–1005.
David, J. M. and V. Venkateswaran (2019): “The Sources of Capital Misallocation,” American Economic Review, 109, 2531–67.
Donangelo, A., F. Gourio, M. Kehrig, and M. Palacios (2018): “The cross-section of labor leverage and equity returns,” Journal of Financial Economics.
Edmond, C., V. Midrigan, and D. Y. Xu (2018): “How costly are markups?” Tech. rep., National Bureau of Economic Research.
Eisfeldt, A. and A. Rampini (2006): “Capital reallocation and liquidity,” Journal of Monetary Economics, 53, 369–399.
Eisfeldt, A. L. and Y. Shi (2018): “Capital Reallocation,” Tech. rep., University of California, Los Angeles.
Fama, E. F. and K. R. French (1992): “Cross-Section of Expected Stock Returns,” The Journal of Finance, 47, 3247–3265.
Fama, E. F. and J. D. MacBeth (1973): “Risk, Return, and Equilibrium: Empirical Tests,” Journal of Political Economy, 81, 607–636.
Gilchrist, S., J. W. Sim, and E. Zakrajšek (2013): “Misallocation and financial market frictions: Some direct evidence from the dispersion in borrowing costs,” Review of Economic Dynamics, 16, 159–176.
Gilchrist, S. and E. Zakrajsek (2012): “Credit Spreads and Business Cycle Fluctuations,” American Economic Review, 102, 1692–1720.
Gomes, J. and L. Schmid (2010): “Levered Returns,” Journal of Finance, 65, 467–494.
Gomes, J., A. Yaron, and L. Zhang (2006): “Asset pricing implications of firms’ financing constraints,” Review of Financial Studies, 19, 1321–1356.
Gopinath, G., Ş. Kalemli-Özcan, L. Karabarbounis, and C. Villegas-Sanchez (2017): “Capital Allocation and Productivity in South Europe,” The Quarterly Journal of Economics, 132, 1915–1967.
Guren, A. M., A. McKay, E. Nakamura, and J. Steinsson (2018): “Housing Wealth Effects: The Long View,” Tech. rep., Working Paper.
Haltiwanger, J., R. Kulick, and C. Syverson (2018): “Misallocation measures: The distortion that ate the residual,” Tech. rep., National Bureau of Economic Research.
Hopenhayn, H. A. (2014): “Firms, misallocation, and aggregate productivity: A review,” Annu. Rev. Econ., 6, 735–770.
Hou, K., C. Xue, and L. Zhang (2015): “Digesting anomalies: An investment approach,” Review of Financial Studies.
Hsieh, C. and P. Klenow (2009): “Misallocation and Manufacturing TFP in China and India,” Quarterly Journal of Economics, 124, 1403–1448.
İmrohoroğlu, A. and Ş. Tüzel (2014): “Firm-level productivity, risk, and return,” Management Science, 60, 2073–2090.
Jaimovich, N., S. Rebelo, and A. Wong (2019): “Trading down and the business cycle,” Journal of Monetary Economics.
Jones, C. S. and S. Tuzel (2013): “Inventory Investment and The Cost of Capital,” Journal of Financial Economics, 107, 557–579.
Kehrig, M. (2015): “The Cyclical Nature of the Productivity Distribution,” Working paper.
Kehrig, M. and N. Vincent (2017): “Do Firms Mitigate or Magnify Capital Misallocation? Evidence from Plant-Level Data,” Working Paper.
Kennan, J. (2006): “A note on discrete approximations of continuous distributions,” University of.
Kogan, L. and D. Papanikolaou (2013): “Firm characteristics and stock returns: The role of investment-specific shocks,” The Review of Financial Studies, 26, 2718–2759.
Lewellen, J. (2015): “The Cross-section of Expected Stock Returns,” Critical Finance Review, 4, 1–44.
Lewellen, J. and S. Nagel (2006): “The conditional CAPM does not explain asset-pricing anomalies,” Journal of Financial Economics, 82, 289–314.
Liu, L. X., T. M. Whited, and L. Zhang (2009): “Investment-based expected stock returns,” Journal of Political Economy, 117, 1105–1139.
Lucas, R. E. (1987): Models of Business Cycles, vol. 26, Basil Blackwell Oxford.
Midrigan, V. and D. Y. Xu (2014): “Finance and misallocation: Evidence from plant-level data,” The American Economic Review, 104, 422–458.
Moll, B. (2014): “Productivity losses from financial frictions: can self-financing undo capital misallocation?” The American Economic Review, 104, 3186–3221.
Nevo, A. and A. Wong (2015): “The elasticity of substitution between time and market goods: Evidence from the Great Recession,” Tech. rep., National Bureau of Economic Research.
Novy-Marx, R. (2013): “The other side of value: The gross profitability premium,” Journal of Financial Economics, 108, 1–28.
Peters, M. (2016): “Heterogeneous Mark-Ups, Growth and Endogenous Misallocation,” Working Paper.
Restoy, F. and M. Rockinger (1994): “On Stock Market Returns and Returns on Investment,” Journal of Finance, 49, 543–556.
Restuccia, D. and R. Rogerson (2008): “Policy Distortions and Aggregate Productivity with Heterogeneous Establishments,” Review of Economic Dynamics, 11, 707–720.
Restuccia, D. and R. Rogerson (2017): “The causes and costs of misallocation,” Journal of Economic Perspectives, 31, 151–74.
Zhang, L.
(2005): “The Value Premium,” Journal of Finance, 60, 67–103.
Zhang, L. (2017): “The Investment CAPM,” European Financial Management.

Appendix: For Online Publication

A Motivation

A.1 Derivation of equation (3) and examples.

$$1 = E_t\left[M_{t+1}\left(MPK_{it+1} + 1 - \delta\right)\right] = E_t\left[M_{t+1}\right] E_t\left[MPK_{it+1} + 1 - \delta\right] + cov_t\left(M_{t+1}, MPK_{it+1}\right)$$

The (gross) risk-free rate satisfies $R_{ft} = \frac{1}{E_t[M_{t+1}]}$. Combining and rearranging yields

$$E_t\left[MPK_{it+1}\right] = MPK_{ft+1} - \frac{cov_t\left(M_{t+1}, MPK_{it+1}\right)}{E_t\left[M_{t+1}\right]} = \alpha_t + \beta_{it}\lambda_t$$

where $\alpha_t$, $\beta_{it}$ and $\lambda_t$ are as defined in the text.

No aggregate risk. With no aggregate risk, $M_{t+1} = \rho \;\forall\, t$, where $\rho$ is the rate of time discount. The Euler equation gives

$$1 = \rho\left(E_t\left[MPK_{it+1}\right] + 1 - \delta\right) \;\forall\, i, t \quad \Rightarrow \quad E_t\left[MPK_{it+1}\right] = \frac{1}{\rho} - (1 - \delta) = r_f + \delta$$

CAPM. Clearly, $-cov\left(M_{t+1}, MPK_{it+1}\right) = b\, cov\left(r_{mt+1}, MPK_{it+1}\right)$ and $var\left(M_{t+1}\right) = b^2 var\left(r_{mt+1}\right)$. Since the market return is an asset, it must satisfy $E_t\left[r_{mt+1}\right] = r_{ft} + \frac{\lambda_t}{b}$ so that $\lambda_t = b\left(E_t\left[r_{mt+1}\right] - r_{ft}\right)$. Substituting into expression (3) gives the CAPM expression in the text.

CRRA preferences. A log-linear approximation to the SDF around its unconditional mean gives $M_{t+1} \approx E\left[M_{t+1}\right]\left(1 + m_{t+1} - E\left[m_{t+1}\right]\right)$ and in the case of CRRA utility, $m_{t+1} = -\gamma \Delta c_{t+1}$, where $\Delta c_{t+1}$ is log consumption growth. Substituting for $M_{t+1}$ into expression (3) gives the CCAPM expression in the text.

A.2 Extensions

Alphas. We model alphas as firm-level distortions in discount rates. Specifically, we assume the payoffs of firm $i$ are discounted using

$$\tilde{M}_{it+1} = M_{t+1} T_{it+1}$$

where $T_{it+1}$ denotes a firm-specific distortion to the discount factor. The Euler equation then takes the form

$$1 = (1 - \delta) E_t\left[\tilde{M}_{it+1}\right] + E_t\left[\tilde{M}_{it+1} MPK_{it+1}\right]$$

Applying a similar approach as above and, for simplicity, assuming that expected discount factors are undistorted, we obtain

$$E_t\left[MPK_{it+1}\right] = \alpha_t + \beta_{it}\lambda_t$$

where $\alpha_t$ and $\lambda_t$ are as defined in expression (3) and $\beta_{it} = -\frac{cov_t\left(MPK_{it+1}, \tilde{M}_{it+1}\right)}{var_t\left(M_{t+1}\right)}$. Thus, even if all firms have the same dynamic process for MPK, dispersion in expected MPK can arise from differences in the stochastic processes of the discount factors, $\tilde{M}_{it+1}$.

Firm-specific investment prices. Here, we allow firms to face different prices of capital, denoted $Q_{it}$, so that the cost of new investment in period $t$ is equal to $Q_{it}\left(K_{it+1} - (1 - \delta) K_{it}\right)$. The Euler equation is given by

$$Q_{it} = E_t\left[M_{t+1}\left(MPK_{it+1} + Q_{it+1}(1 - \delta)\right)\right] \qquad (34)$$

and rearranging,

$$E_t\left[MPK_{it+1}\right] = Q_{it} R_{ft} - (1 - \delta) E_t\left[Q_{it+1}\right] - \frac{cov_t\left(MPK_{it+1} + (1 - \delta) Q_{it+1}, M_{t+1}\right)}{E_t\left[M_{t+1}\right]} = \alpha_{it} + \beta_{it}\lambda_t$$

where $\alpha_{it} = Q_{it} R_{ft} - (1 - \delta) E_t\left[Q_{it+1}\right]$ is the (now firm-specific) risk-free cost of capital, $\beta_{it} = -\frac{cov_t\left(MPK_{it+1} + (1 - \delta) Q_{it+1}, M_{t+1}\right)}{var_t\left(M_{t+1}\right)}$ and $\lambda_t = \frac{var_t\left(M_{t+1}\right)}{E_t\left[M_{t+1}\right]}$. Thus, even if all firms have the same co-movement of MPK with the SDF, i.e., $cov_t\left(MPK_{it+1}, M_{t+1}\right)$ is constant across firms, heterogeneity in the co-movement of $Q_{it}$ with the SDF will lead to differences in risk premia across firms and hence in expected MPK.
A.3 MPK and Stock Returns

To derive equation (5), use the Euler equation

$$1 = E_t\left[M_{t+1}\left(MPK_{it+1} + 1 - \delta\right)\right] = (1 - \delta) E_t\left[M_{t+1}\right] + E_t\left[M_{t+1} MPK_{it+1}\right]$$

Assume that $MPK_{it+1}$ and $M_{t+1}$ are jointly log-normal and use the fact that the risk-free rate satisfies $R_{ft} = \frac{1}{E_t[M_{t+1}]}$ to obtain

$$R_{ft} = (1 - \delta) + e^{E_t\left[mpk_{it+1}\right] + \frac{1}{2} var_t\left(mpk_{it+1}\right) + cov_t\left(mpk_{it+1}, m_{t+1}\right)}$$

or, rearranging and suppressing variance terms for simplicity,

$$Empk^e_{it+1} \equiv E_t\left[mpk_{it+1}\right] - \log\left(r_{ft} + \delta\right) \approx -cov_t\left(mpk_{it+1}, m_{t+1}\right)$$

To derive equation (6), standard techniques give the risk premium on stocks as

$$Er^e_{it+1} \equiv E_t\left[r_{it+1}\right] - r_{ft} \approx -cov_t\left(r_{it+1}, m_{t+1}\right)$$

With a single source of aggregate risk, a first order approximation gives the return as its expected value plus terms that are linear in the unexpected shocks, i.e.,

$$r_{it+1} = E_t\left[r_{it+1}\right] + \psi_\varepsilon \varepsilon_{it+1} + \psi \beta_{it} \varepsilon_{t+1}$$

where $\psi_\varepsilon$ and $\psi$ are constants of linearization and $\beta_{it}$ captures the firm-specific exposure to the aggregate shock ($\psi_\varepsilon$ and $\varepsilon_{it}$ can either be scalars, or vectors of exposures and realizations of idiosyncratic shocks). Similarly, mpk satisfies

$$mpk_{it+1} = E_t\left[mpk_{it+1}\right] + \varepsilon_{it+1} + \beta_{it} \varepsilon_{t+1}$$

Then,

$$cov_t\left(r_{it+1}, m_{t+1}\right) = \psi\, cov_t\left(mpk_{it+1}, m_{t+1}\right)$$

Substituting yields

$$Er^e_{it+1} \approx -\psi\, cov_t\left(mpk_{it+1}, m_{t+1}\right)$$

For a detailed derivation of these results, see the approach in Appendix D.1 and D.4. For a multifactor version, see Appendix F.1.

B Data

In this appendix, we describe the various data sources used throughout our analysis.

B.1 Sources and Series Construction

We obtain firm-level data from COMPUSTAT and CRSP.54 We include firms coded as industrial firms from 1965-2015.
Our time-series regressions and portfolio sorts use data from 1973-2015, since data on the GZ spread and excess bond (EB) premium begin in 1973 and because there are relatively few industries with at least 10 firms in a given year pre-1973.55 We further exclude financial firms by dropping those with COMPUSTAT SIC codes that correspond to finance, insurance, and real estate (FIRE, SIC codes 6000-6999). We also exclude firms with missing SIC codes or coded as non-classifiable, as much of our analysis examines within-industry variables. We measure firm revenue using sales from Compustat (series SALE), and capital using the depreciated value of plant, property, and equipment (series PPENT). We measure firm marginal product of capital in logs (up to an additive constant) as the difference between log revenue and log capital, $mpk_{it} = y_{it} - k_{it}$. Market capitalization is measured as the price times shares outstanding from CRSP and profitability as the ratio of earnings before interest, taxes, depreciation, and amortization (EBITDA) to book assets (AT). We measure market leverage as the ratio of book debt to the sum of market capitalization plus book debt, where book debt is measured as current liabilities (LCT) + 1/2 long-term debt (DLTT), following Gilchrist and Zakrajsek (2012).

We obtain data on aggregate risk factors from the following sources. Data on the Fama-French factors are from Kenneth French’s website, http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/, while the Hou et al. (2015) $q^5$ factors are from http://global-q.org/factors.html. Updated data on the price/dividend ratio are from Robert J. Shiller’s website, http://www.econ.yale.edu/~shiller/ and updated measures of the GZ spread and excess bond premium are from Simon Gilchrist’s website, http://people.bu.edu/sgilchri/.

Computation of betas and expected returns. Here, we describe our procedure to compute betas and expected returns.
We estimate stock market betas by performing time-series regressions of firm-level excess returns (realized returns from CRSP in excess of the risk-free rate), $r^e_{it}$, on aggregate factors, denoted by the $N \times 1$ vector $F_t$. For each firm, the specification takes the form

$$r^e_{it} = \alpha_{i\tau} + \beta_{i\tau} F_t + \epsilon_{it} \qquad (35)$$

We estimate these regressions at the quarterly frequency using backwards-looking five-year rolling windows, i.e., for $t \in \{\tau - N_\tau + 1, \tau - N_\tau + 2, \ldots, \tau\}$, where $\beta_{i\tau}$ denotes the $1 \times N$ vector of factor loadings and $N_\tau$ the length of the window.56 Under the CAPM, the single risk factor is the aggregate market return. Under the Fama-French 3 factor model, the risk factors are the market return (MKT), the return on a portfolio that is long in small firms and short in large ones (SMB) and the return on a portfolio that is long in high book-to-market firms and short in low ones (HML). Under the Hou et al. (2015) $q^5$ 5 factor model, the risk factors are the market return, the return on a portfolio that is long in small firms and short in large ones, the return on a portfolio that is long in low investment firms and short in high investment ones, the return on a portfolio that is long in high profitability (return on equity) firms and short in low profitability ones and the return on a portfolio that is long in firms with high expected 1-year ahead investment-to-assets changes and short in firms with low ones.

Next, we estimate the following cross-sectional regression in each period:

$$r^e_{it} = \alpha_t + \lambda_t \beta_{it} + \epsilon_{it} \qquad (36)$$

where $\lambda_t$ denotes the $1 \times N$ vector of period $t$ factor risk prices and $\beta_{it}$ the $N \times 1$ vector of exposures, estimated as just described.

54 Source: CRSP®, Center for Research in Security Prices, Booth School of Business, The University of Chicago. Used with permission. All rights reserved.

55 The results are qualitatively similar if we use data from the full 1965-2015 sample.
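The two regressions above, rolling time-series betas as in (35) followed by a period-by-period cross-sectional regression as in (36), can be sketched as follows. This is a minimal illustration on simulated data, not the authors' implementation: the function names `rolling_betas` and `cross_sectional_prices`, the single-factor setup, the use of same-period betas in the second step and all parameter values are assumptions.

```python
import numpy as np

def rolling_betas(excess_ret, factors, window):
    """Backward-looking rolling OLS of returns on factors.

    excess_ret: (T, n_firms) excess returns; factors: (T, N) factor realizations.
    Returns an array (T, n_firms, N) of loadings, NaN until the first full window.
    """
    T, n_firms = excess_ret.shape
    n_factors = factors.shape[1]
    betas = np.full((T, n_firms, n_factors), np.nan)
    for tau in range(window - 1, T):
        rows = slice(tau - window + 1, tau + 1)
        X = np.column_stack([np.ones(window), factors[rows]])   # intercept + factors
        coef, *_ = np.linalg.lstsq(X, excess_ret[rows], rcond=None)
        betas[tau] = coef[1:].T                                 # drop the intercepts
    return betas

def cross_sectional_prices(excess_ret_t, betas_t):
    """Cross-sectional OLS of one period's excess returns on firm betas.

    Returns the intercept alpha_t and the vector of factor risk prices lambda_t.
    """
    X = np.column_stack([np.ones(betas_t.shape[0]), betas_t])
    coef, *_ = np.linalg.lstsq(X, excess_ret_t, rcond=None)
    return coef[0], coef[1:]

# Simulated quarterly panel; all values are hypothetical.
rng = np.random.default_rng(2)
T, n_firms, window = 80, 200, 20                     # 20 quarters = five years
factors = rng.normal(0.02, 0.08, size=(T, 1))        # one CAPM-style factor
true_beta = rng.uniform(0.5, 1.5, size=n_firms)
excess_ret = factors @ true_beta[None, :] + rng.normal(0.0, 0.10, size=(T, n_firms))

betas = rolling_betas(excess_ret, factors, window)
alpha_t, lambda_t = cross_sectional_prices(excess_ret[-1], betas[-1])
print(f"alpha_t = {alpha_t:.3f}, lambda_t = {lambda_t[0]:.3f}")
```

With several factors, `factors` simply gains columns and `lambda_t` becomes a vector with one risk price per factor.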
We calculate expected stock returns as $\alpha_i + \lambda \beta_{it}$, where $\beta_{it}$ is as estimated from equation (35), $\lambda$ is calculated using the estimates from (36), and $\alpha_i$ is calculated as $\alpha_i = \frac{1}{T} \sum_{t=1}^{T} \left(\alpha_{it} + \epsilon_{it}\right)$, also using the estimates from (36).

Composition-adjusted measures of dispersion. For Table 2, we compute time-series of the cross-sectional dispersion in MPK. Because Compustat is an unbalanced panel with significant changes in the composition of firms over time, it is important to ensure that we measure the variation in dispersion due to changes in firm MPK, rather than additions or deletions from the dataset (especially since many additions and deletions to the Compustat data may not be true firm entry or exit). We therefore compute composition-adjusted measures of the cross-sectional standard deviation in MPK that are only affected by firms that continue on in the dataset. We use the following procedure: For each set of adjacent periods, e.g., $t$ and $t + 1$, we compute the cross-sectional standard deviation in each time period only for those firms that are present in the data in both periods. Taking the difference yields the change from time $t$ to $t + 1$ that is due only to changes in the common set of firms. Completing this procedure yields a time-series of changes in the cross-sectional standard deviation of MPK. We then combine this time-series of changes with the initial value of the standard deviation (across all firms in the initial period) to construct a synthetic series for the standard deviation that is not affected by the changing composition of firms in the data.

56 We have also estimated the stock market betas using higher frequency monthly data (and two-year rolling windows) and obtained similar results.

B.2 Expected Return Distribution

Table 5 reports statistics from the cross-sectional distribution of expected returns ($E[r^e]$) and unlevered expected returns ($E[r^a]$), which is a measure of expected asset returns, estimated from the Fama-French model.
We de-lever expected returns using an adjustment factor computed from Black-Scholes following the approach in, e.g., Bharath and Shumway (2008) and Gilchrist and Zakrajsek (2012). Specifically, we implement an iterative procedure using data on realized equity volatility, firm debt, and firm market capitalization to compute the implied value of assets and asset volatility. The Black-Scholes equations imply $E[r^a] \approx \frac{Mkt.\,cap.}{V_A} \Phi(\delta_1) E[r^e]$, where $V_A$ is the total firm asset value implied by Black-Scholes as a function of the market capitalization of equity, book debt, and realized backwards-looking equity volatility and $\Phi(\delta_1)$ is the Black-Scholes “delta” of equity, as defined in, e.g., Gilchrist and Zakrajsek (2012). We compute the adjustment factor $\frac{Mkt.\,cap.}{V_A} \Phi(\delta_1)$ for each firm using daily data and a 21 day backwards-looking window for equity volatility and then calculate a firm-year adjustment factor by averaging this adjustment factor for each firm-year. Finally, we compute un-levered expected returns for each firm as the product of its expected equity return multiplied by this factor.

To find the cross-sectional distribution of within-industry expected returns, we de-mean expected returns by industry-year, keeping industry-years with at least 10 observations. We then add back the means and report the resulting distribution.57 Figure 1 plots the full cross-sectional distribution of within-industry expected excess asset and equity returns.

Table 5: The Distribution of Expected Excess Returns

Percentile     10th     25th     Mean     75th     90th     Std. Dev.
Panel A: Not Industry-Adjusted
E[r^a]        -3.6%     4.0%     9.8%    17.1%    24.7%     13.2%
E[r^e]        -5.3%     6.6%    12.1%    20.6%    28.6%     15.6%
Panel B: Industry-Adjusted
E[r^a]        -3.6%     4.7%     9.8%    16.6%    23.6%     12.7%
E[r^e]        -4.6%     6.6%    12.1%    20.3%    28.1%     15.0%

Notes: This table reports the cross-sectional distributions of un-levered expected excess equity returns, $E[r^a]$, and expected excess equity returns, $E[r^e]$.
Industry adjustment is done by demeaning each measure of expected returns by industry-year. We then add back the mean returns to these distributions.

B.3 Time-Series Correlations

Table 6 reports contemporaneous correlations between (within-industry) MPK dispersion and indicators of the price of risk and the business cycle.

57 The results are similar if we compute our cross-sectional statistics within each year or industry-year and average over the years/industry-years.

[Figure 1, panels (a) E[r^a] and (b) E[r^e]: each panel shows a histogram and a kernel density.]

Figure 1: Cross-Sectional Distribution of Expected Excess Returns

Notes: This figure displays the cross-sectional distributions of un-levered expected excess equity returns, $E[r^a]$, and expected excess equity returns, $E[r^e]$. Industry adjustment is done by demeaning each measure of expected returns by industry-year. We then add back the mean returns to these distributions. The vertical bars denote the histograms of these distributions, while the solid lines are the results of kernel smoothing regressions with a bandwidth of 0.25.

B.4 Aggregate Productivity Series

Solow residuals. To build a series of Solow residuals, we obtain data on real GDP and aggregate labor and capital from the Bureau of Economic Analysis. Data on real GDP are from BEA Table 1.1.3 (“Real Gross Domestic Product”), data on labor are from BEA Table 6.4 (“Full-Time and Part-Time Employees”) and data on the capital stock are from BEA Table 1.2 (“Net Stock of Fixed Assets”). The data are available annually from 1929-2016. With these data we compute $x_t = y_t - \theta_1 k_t - \theta_2 n_t$. We extract a linear time-trend and then estimate the autoregression in equation (9).

Firm-level series. To construct the alternative series for aggregate productivity from the firm-level data, we use the following procedure. First, we compute firm-level productivity as $z_{it} + \beta_i x_t = y_{it} - \theta k_{it}$.
We then average these values across all firms in each year. Because $z_{it}$ is mean-zero and independent across firms, this yields a scaled measure of aggregate productivity, $\bar{\beta} x_t$, where $\bar{\beta}$ is the mean beta across firms, which, under our assumptions, is approximately two. We extract a linear time-trend from this series and then estimate the autoregression. The coefficient from this regression gives $\rho_x$. The standard deviation of the residuals gives $\bar{\beta} \sigma_\varepsilon$ and after dividing by $\bar{\beta}$ gives the true volatility of shocks. Applying this procedure to the set of Compustat firms over the period 1962-2016 yields values of $\rho_x = 0.92$ and $\sigma_\varepsilon = 0.0245$.

Table 6: Correlations of MPK Dispersion, the Price of Risk and the Business Cycle

                  MPK Dispersion   PD Ratio   GZ Spread   EB Premium    GDP     TFP
MPK Dispersion         1.00
PD Ratio              -0.42          1.00
GZ Spread              0.39         -0.51       1.00
EB Premium             0.51         -0.57       0.68        1.00
GDP                   -0.53          0.46      -0.59       -0.66       1.00
TFP                   -0.27          0.43      -0.32       -0.44       0.70    1.00

Notes: This table reports time-series correlations of MPK dispersion, measures of the price of risk and the business cycle. MPK dispersion is measured as the within-industry standard deviation in mpk. The PD ratio is the aggregate stock market price/dividend ratio. The GZ spread and EB (excess bond) premium are measures of credit spreads. GDP is log GDP and TFP is log TFP. We extract the cyclical components of GDP, TFP and the PD ratio using a one-sided Hodrick-Prescott filter. All series are described in more detail in the main text and Appendix B.1. All data are quarterly and are from 1973-2015.

C Additional Empirical Results

MPK and stock returns. We perform two additional exercises examining the link between MPK, stock market returns and exposure to aggregate risk. First, we verify that high MPK firms tend to offer higher expected stock market returns. To do so, we group firms into five bins, or portfolios, based on their MPK and assess whether the high MPK groups tend to exhibit higher stock market returns than the low MPK groups.
We sort firms into five portfolios based on their year t MPK, where portfolio 1 contains low MPK firms and portfolio 5 high MPK ones. The portfolios are rebalanced annually. We then compute four versions of the equal-weighted excess stock return on each portfolio: the contemporaneous return, denoted $r^e_t$, the one-period ahead return, $r^e_{t+1}$, the three-period ahead return, $r^e_{t+3}$, and the one-period ahead unlevered return, $r^a_{t+1}$, which we calculate using an unlimited liability model, $r^a_{t+1} = \frac{Mktcap}{Mktcap + Debt}\, r^e_{t+1}$.58 We also compute the excess return on a high-minus-low MPK portfolio (MPK-HML), which is an annually rebalanced portfolio that is long on stocks in the highest MPK portfolio and short on stocks in the lowest.

58 When computing one-period ahead returns, we follow Fama and French (1992) and associate the MPK for fiscal year t with returns from July of year t + 1 to June of year t + 2. Similar timing holds for three-period ahead returns. Value-weighted portfolios yield similar magnitudes, though the standard errors are greater in some specifications since value-weighting the smaller within-industry samples can increase the variance of portfolio returns. In our log-normal model, equal-weighted dispersion is the key object of interest.

Examining firms grouped by MPK helps eliminate firm-specific factors unrelated to MPK that may affect returns and so allows us to home in on the predictability of excess returns by MPK. It also follows recent practice in empirical finance, which has generally moved from addressing variation in individual firm returns to returns on portfolios of firms, sorted by factors that are known to predict returns. In our context, however, this is likely to provide only a noisy measure of the true relationship between MPK and risk premia. For example, the main text studies a number of additional reasons why MPK may differ across firms, e.g., the
realization of unanticipated shocks (equation (14)), capital adjustment costs (Section 3.2), other frictions/distortions (Section 4.3), etc. Each of these forces influences MPK and thus affects the sorting variable (e.g., the high MPK bin includes firms with a high risk premium, but also firms with a low risk premium but a high realization of the idiosyncratic shock or a large distortion). The effect of this additional noise is analogous to that of measurement error in the right-hand side variable of a regression, attenuating the true relationship between the variables. Our two-step strategy in Table 1 is designed in part to address this concern. However, we think it useful to examine whether MPK is associated with stock returns using this simpler approach, with the caveat that the quantitative magnitudes should be interpreted with caution and likely represent a lower bound.

The focus of our analysis (and the misallocation literature more broadly) is on within-industry variation in MPK and so, to control for industry effects, we demean firm-level mpk by industry-year and sort firms based on this de-meaned measure.59 For completeness, we also present results for total, non-industry-adjusted sorts, since the non-adjusted results may be interesting in their own right (discussed more below) and confirm that the link between MPK and stock returns holds at various levels of aggregation. We report within-industry results in Panel A of Table 7. The table reveals a strong relationship between MPK and stock returns – high MPK portfolios tend to earn high excess returns. The first row shows that the difference in contemporaneous returns between high and low MPK firms, i.e., the excess return on the MPK-HML portfolio, is over 8% annually. The second row confirms that this finding does not simply result from the simultaneous response of stock returns and MPK to the realization of unexpected shocks – one-period ahead excess returns are in fact predictable by MPK.
The predictable spread on the MPK-HML portfolio is over 2.5% annually. Both the contemporaneous and future MPK-HML spreads are statistically different from zero at the 99% level. The last two rows of Panel A confirm that the results continue to hold when examining returns further in the future, and thus exhibit persistence, and after de-levering equity returns. Thus, high MPK firms tend to offer high stock returns, both in a realized and an expected sense, suggesting that MPK differences reflect exposure to risk factors for which investors demand compensation in the form of a higher rate of return.60

59 There may be heterogeneity across industries on a number of dimensions, for example, in production function coefficients or industry-level exposure to aggregate shocks.

60 Although there are several measurement differences, the results in Table 7 are related to the “profitability premium” documented in Novy-Marx (2013) and others, i.e., high profit-to-capital firms earn high excess returns (both industry and non industry-adjusted). Further, Novy-Marx (2013) finds that the sales-to-assets component of profitability is the most directly related to higher returns (Appendix A.2 in that paper).

Panel B of Table 7 reports the “total” results not controlling for industry. Comparing the two panels shows that the relationship between MPK and returns is even stronger when taken unconditionally across industries, suggesting that there is indeed an industry-level component of excess returns that is predictable by an industry-level component of MPK. Although we do not explore this finding in more detail, it is reassuring confirmation of the link we are after – firms in industries with high average MPK tend to offer higher returns (in a predictable sense) than firms in low MPK industries, suggesting that industry-level exposures to aggregate risk factors may be important as well.
Note that across all the variations reported in Table 7, the within-industry effects are more than half of the total, implying a key role for the within-industry component.61

Table 7: Excess Returns on MPK-Sorted Portfolios

Panel A: Within-Industry

| Portfolio | Low | 2 | 3 | 4 | High | MPK-HML |
|---|---|---|---|---|---|---|
| $r^e_t$ | 6.98 (1.63) | 8.91∗∗ (2.52) | 10.59∗∗∗ (3.05) | 12.28∗∗∗ (3.30) | 15.78∗∗∗ (3.73) | 8.80∗∗∗ (9.54) |
| $r^e_{t+1}$ | 11.10∗∗∗ (2.61) | 11.55∗∗∗ (3.35) | 12.71∗∗∗ (3.75) | 12.70∗∗∗ (3.50) | 13.69∗∗∗ (3.36) | 2.59∗∗∗ (2.98) |
| $r^e_{t+3}$ | 11.95∗∗∗ (2.99) | 12.27∗∗∗ (3.71) | 12.04∗∗∗ (3.75) | 12.60∗∗∗ (3.60) | 13.82∗∗∗ (3.58) | 1.87∗∗ (2.22) |
| $r^a_{t+1}$ | 6.86∗∗ (2.13) | 7.16∗∗∗ (2.94) | 8.04∗∗∗ (3.37) | 8.17∗∗∗ (3.15) | 8.84∗∗∗ (3.04) | 1.97∗∗∗ (2.66) |

Panel B: Total

| Portfolio | Low | 2 | 3 | 4 | High | MPK-HML |
|---|---|---|---|---|---|---|
| $r^e_t$ | 7.00∗∗ (2.01) | 9.08∗∗ (2.53) | 10.67∗∗∗ (2.93) | 12.00∗∗∗ (3.09) | 15.25∗∗∗ (3.71) | 8.25∗∗∗ (4.54) |
| $r^e_{t+1}$ | 8.60∗∗ (2.48) | 12.27∗∗∗ (3.47) | 13.48∗∗∗ (3.80) | 13.73∗∗∗ (3.62) | 13.48∗∗∗ (3.36) | 4.87∗∗∗ (2.81) |
| $r^e_{t+3}$ | 9.63∗∗∗ (2.96) | 12.43∗∗∗ (3.69) | 12.69∗∗∗ (3.71) | 13.90∗∗∗ (3.81) | 12.99∗∗∗ (3.38) | 3.36∗ (1.96) |
| $r^a_{t+1}$ | 4.64∗ (1.88) | 7.53∗∗∗ (3.07) | 8.69∗∗∗ (3.53) | 8.66∗∗∗ (3.26) | 8.22∗∗∗ (3.02) | 3.58∗∗∗ (3.05) |

Notes: This table reports stock market returns for portfolios sorted by mpk. $r^e_t$ denotes equal-weighted contemporaneous annualized monthly excess stock returns (over the risk-free rate) measured in the year of the portfolio formation from January to December of year t. $r^e_{t+1}$ denotes the analogous future returns, measured from July of year t + 1 to June of year t + 2, and $r^e_{t+3}$ from July of year t + 3 to June of year t + 4. $r^a_{t+1}$ denotes equal-weighted unlevered (“asset”) returns from July of year t + 1 to June of year t + 2, where we use an unlimited liability model to unlever equity returns. Industry adjustment is done by de-meaning mpk by industry-year and sorting portfolios on de-meaned mpk, where industries are defined at the 4-digit SIC code level. t-statistics in parentheses, computed using Newey-West standard errors. Significance levels are denoted by: * p < 0.10, ** p < 0.05, *** p < 0.01.
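The portfolio sort behind Table 7 can be illustrated with a minimal sketch. This is not the authors' code: the function names, the dictionary layout of each firm-year record and the toy panel are all hypothetical, returns are treated as a single annual number per firm rather than monthly series, and no Newey-West standard errors are computed. The sketch only shows the mechanics of de-meaning mpk by industry-year, sorting firms into five bins each year and forming the equal-weighted high-minus-low spread.

```python
from collections import defaultdict
from statistics import mean


def demean_by_industry_year(rows):
    """Subtract the industry-year mean of mpk from each firm-year observation."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r["industry"], r["year"])].append(r["mpk"])
    means = {key: mean(vals) for key, vals in groups.items()}
    return [dict(r, mpk_dm=r["mpk"] - means[(r["industry"], r["year"])]) for r in rows]


def assign_bins(rows, n_bins=5):
    """Within each year, rank firms on de-meaned mpk; bin 1 = low, n_bins = high."""
    by_year = defaultdict(list)
    for r in rows:
        by_year[r["year"]].append(r)
    out = []
    for obs in by_year.values():
        ranked = sorted(obs, key=lambda r: r["mpk_dm"])
        for rank, r in enumerate(ranked):
            out.append(dict(r, bin=1 + rank * n_bins // len(ranked)))
    return out


def average_portfolio_returns(rows, n_bins=5):
    """Equal-weighted return per (year, bin), averaged over years, plus the HML spread.

    Assumes every bin is populated in every year (true for the toy panel below).
    """
    cells = defaultdict(list)
    for r in rows:
        cells[(r["year"], r["bin"])].append(r["ret_next"])
    years = sorted({year for year, _ in cells})
    avg = {b: mean(mean(cells[(y, b)]) for y in years) for b in range(1, n_bins + 1)}
    avg["HML"] = avg[n_bins] - avg[1]
    return avg


# Hypothetical toy panel: two industries with different mpk levels, two years,
# five firms per industry-year; next-period returns rise with within-industry mpk.
panel = []
for year in (2000, 2001):
    for industry, base in (("A", 0.0), ("B", 1.0)):
        for k in range(5):
            panel.append({"industry": industry, "year": year,
                          "mpk": base + 0.1 * k, "ret_next": 0.05 + 0.02 * k})

result = average_portfolio_returns(assign_bins(demean_by_industry_year(panel)))
print(result)
```

Because the sort uses de-meaned mpk, the level difference between the two hypothetical industries does not affect the bin assignment; only within-industry variation does, which is the point of the industry adjustment in Panel A.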
61 In unreported results, we have verified that the relationship between MPK and returns continues to hold when we expand the number of portfolios (to 10) and control for size and book-to-market (though they are both correlated with MPK).

MPK and measures of risk exposure. Next, we directly relate firm MPK to measures of risk exposure using the sensitivity of MPK to aggregate risk factors. To do so, we estimate regressions of the form

$$mpk_{it+1} = \psi_0 + \psi_\beta \beta_{it} + \zeta_{it+1}, \qquad (37)$$

where $\beta_{it}$ is a measure of firm i’s MPK exposure to aggregate risk at time t. Although there are several important measurement concerns in calculating these exposures compared to stock market-based measures – e.g., they are lower frequency, may be more prone to issues of measurement/sampling error (long enough samples of MPK are only available for a subset of firms) and require assumptions about the factor structure of aggregate risk in MPK that is less well-explored than that in stock returns – it can still be useful to examine the relationship of these exposures to firm-level MPK, keeping in mind these important caveats.62

We calculate measures of MPK exposure using the CAPM and Fama-French models. Specifically, we follow an analogous procedure to (35) and (36), replacing excess stock market returns on the left-hand side of (35) and (36) with $mpk_{it}$. The first regression yields measures of $\beta_{MPK}$, i.e., the exposure of each firm’s MPK to the aggregate risk factors. The second regression combines these exposures into a single value in the multi-factor Fama-French model using the coefficients from cross-sectional Fama and MacBeth (1973) regressions, i.e., as

$$\beta_{it,FF} = \lambda \beta_{it} = \sum_{x} \lambda_x \beta_{it,x}, \qquad x \in \{MKT, HML, SMB\},$$

where $\lambda_x = \frac{1}{T}\sum_{t=1}^{T} \lambda_{xt}$. We estimate (37) at an annual frequency and lag the right-hand side variable to control for the simultaneous effect of unexpected shocks on contemporaneous measures of beta and MPK. We report the results in columns (1)-(2) in Table 8.
The estimates imply that firm-level MPK is significantly related to the sensitivity of MPK to measures of aggregate risk, i.e., the aggregate market return and the three Fama-French factors.63 In columns (3)-(4), we estimate analogous regressions with the addition of industry-year fixed effects and a set of standard firm-level controls, namely, market capitalization, book-to-market ratio, profitability, and market leverage.64 All of the coefficients remain positive and statistically significant. Thus, the results help confirm a key implication of expression (3): firm-level risk exposures – measured using “MPK betas” – are significantly related to firm-level expected MPK.

62 Our two-stage approach in Section 2.2 helps deal with some of these issues.

63 We report two-way clustered standard errors by firm and industry-year to allow for arbitrary time-series correlations for a given firm and for correlations across firms within an industry at a particular time. These standard errors do not account for the error associated with the generated regressors (betas). As in Guren, McKay, Nakamura, and Steinsson (2018), this requires a bootstrap procedure that clusters only on time but precludes clustering on other dimensions. In unreported results, we follow Guren, McKay, Nakamura, and Steinsson (2018) and perform such a bootstrap. The estimates remain significant across almost all specifications.

64 We describe these series in Appendix B.1.

Table 8: Regressions of MPK on MPK Risk Exposures

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| $\beta_{CAPM,MPK}$ | 0.065∗∗∗ (5.46) | | 0.024∗∗∗ (3.80) | |
| $\beta_{FF,MPK}$ | | 4.005∗∗∗ (9.49) | | 1.097∗∗∗ (4.68) |
| Observations | 79404 | 78920 | 72477 | 71990 |
| F.E. | No | No | Yes | Yes |
| Controls | No | No | Yes | Yes |

Notes: This table reports the results of a panel regression of year-ahead mpk regressed on measures of firm mpk exposure to aggregate risk. Each observation is a firm-year. The dataset contains approximately 10,000 unique firms. F.E. denotes the presence of industry-year fixed effects.
Standard errors are two-way clustered by firm and industry-year. t-statistics in parentheses. Significance levels are denoted by: * p < 0.10, ** p < 0.05, *** p < 0.01.

MPK dispersion and risk premia dispersion. An additional implication of expression (4) is that across groups of firms or segments of the economy, dispersion in expected MPK should be positively related to dispersion in risk premia. We investigate this implication using variation in the dispersion of expected stock market returns and measured risk exposures across industries. Specifically, for each industry in each year, we compute the standard deviation of MPK, $\sigma(mpk)$, expected stock returns, $\sigma(E[r])$, and various measures of beta, $\sigma(\beta)$. We then estimate regressions of industry-level MPK dispersion on the dispersion in expected returns and betas, i.e.,

$$\sigma(mpk_{jt+1}) = \psi_0 + \psi_1 \sigma(x_{jt}) + \zeta_{jt+1}, \qquad x_{jt} = E[r_{jt}],\ \beta_{jt},$$

where j denotes industry. To avoid potential simultaneity biases from the realization of shocks, we lag the independent variables (dispersion in expected returns and betas). Table 9 reports the results of these regressions and verifies that industries with higher dispersion in expected stock returns and risk exposures exhibit greater dispersion in MPK. Column (1) reveals this fact using expected returns calculated from the Fama-French model. Variation in expected return dispersion predicted by the Fama-French model explains over 20% of the variation in MPK dispersion across industry-years. Column (2) regresses MPK dispersion on dispersion in each of the three individual factors – variation in the beta on each factor is significantly related to MPK dispersion. Next, we repeat the exercise using dispersion in MPK betas (described above) as the right-hand side variables. The results in column (3) show that industries with greater dispersion in MPK betas (on each of the Fama-French factors) exhibit greater dispersion in MPK.
Columns (4)-(6) add year fixed effects and a number of controls capturing additional measures of firm heterogeneity within industries – the standard deviations of profitability, size, book-to-market, and market leverage. Across these specifications, measures of within-industry heterogeneity in expected returns and aggregate risk exposures remain positive and significant predictors of within-industry dispersion in MPK.65

Table 9: Industry-Level Dispersion in MPK, Expected Stock Returns and Beta

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| $\sigma(E[r])$ | 2.71∗∗∗ (30.11) | | | 1.20∗∗∗ (9.82) | | |
| $\sigma(\beta_{MKT})$ | | 0.11∗∗∗ (6.48) | | | 0.08∗∗∗ (3.31) | |
| $\sigma(\beta_{HML})$ | | 0.14∗∗∗ (11.18) | | | 0.10∗∗∗ (5.61) | |
| $\sigma(\beta_{SMB})$ | | 0.14∗∗∗ (13.72) | | | 0.07∗∗∗ (5.77) | |
| $\sigma(\beta_{CAPM,MPK})$ | | | 0.01∗∗∗ (8.58) | | | 0.09∗∗∗ (4.08) |
| $\sigma(\beta_{HML,MPK})$ | | | 0.06∗∗∗ (7.96) | | | 0.06∗∗∗ (4.80) |
| $\sigma(\beta_{SMB,MPK})$ | | | 0.06∗∗∗ (10.38) | | | 0.06∗∗∗ (5.70) |
| Observations | 3203 | 3210 | 2398 | 3188 | 3194 | 2380 |
| $R^2$ | 0.221 | 0.265 | 0.200 | 0.261 | 0.285 | 0.348 |
| Industries | 157 | 161 | 142 | 153 | 156 | 138 |
| Year F.E. | No | No | No | Yes | Yes | Yes |
| Controls | No | No | No | Yes | Yes | Yes |

Notes: This table reports a panel regression of the dispersion in mpk within industries on lagged measures of dispersion in risk exposure within those industries. An observation is an industry-year. $E[r]$ is the expected return computed from the Fama-French model. $\beta$ denotes the stock return beta on the Fama-French factors and $\beta_{MPK}$ the mpk beta on the same factors. t-statistics are in parentheses. Significance levels are denoted by: * p < 0.10, ** p < 0.05, *** p < 0.01.

Bootstrapped standard errors for Table 1. Table 1 reports standard errors that are two-way clustered by firm and year, but do not account for the estimation error in the measures of risk exposure used in the first stage, and possibly are limited by the parametric assumptions imposed by the regression specification.
To address this concern, we have performed a custom bootstrapping procedure that runs a three-way block bootstrap on our entire procedure, encompassing (i) our rolling-window estimation of firm risk exposures (betas), (ii) the estimation of firm risk premia from these betas, and (iii) the estimation of the regression of future MPK on firm risk premia. First, we randomly sample those firms in our dataset (with replacement) which report sufficient data to compute MPK and appear for long enough to compute measures of risk exposure. Second, we randomly sample the time periods to use for our first and second stage procedures. Third, within each backwards-looking rolling window used to compute the betas, we randomly sample the time periods of observations used (but use the same time periods for each firm).66 We run this bootstrapping procedure 250 times for each specification. We compute the standard deviation of these estimates, adjusted for the sample size of the re-sampled regressions.67 We find that the regression coefficients in Table 1 remain significant, except for those using the CAPM model. The implied t-statistics for specifications (1)-(6) are, respectively, 0.36, 2.29, 5.53, 0.36, 2.53, and 6.00.68

65 The results are robust to using different asset pricing models to compute betas and expected returns, such as the CAPM and Hou et al. (2015) investment-CAPM models. The relationship is robust to a variety of different controls and industry definitions as well. Finally, the results are qualitatively similar when we use the inter-quartile range instead of the standard deviation as our measure of within-industry dispersion.

66 This will account for potential correlation of estimation error across firms.

67 Our random sampling of firms and years leads to, on average, fewer observations than in our baseline dataset. We adjust the estimated standard deviation for the lower average number of observations.

68 In the case of the CAPM, the bootstrapping algorithm generates a few extreme outliers that lead to the high standard deviation and low t-statistics. If we were to use the percentiles of the distribution instead, the p-value would be lower than what the t-statistic implies. These extreme outliers do not occur with the Fama-French or $q^5$ models, which are known to be better at matching the cross-sectional distribution of risk premia.

D Baseline Model

This appendix provides detailed derivations for the baseline model and analysis.

D.1 Solution – No Adjustment Costs

The static labor choice solves

$$\max_{N_{it}} \; e^{\hat{z}_{it} + \hat{\beta}_i x_t} K_{it}^{\theta_1} N_{it}^{\theta_2} - W_t N_{it}$$

with the associated first order condition

$$N_{it} = \left( \frac{\theta_2 e^{\hat{z}_{it} + \hat{\beta}_i x_t} K_{it}^{\theta_1}}{W_t} \right)^{\frac{1}{1-\theta_2}}$$

Substituting for the wage with $W_t = X_t^{\omega}$ and rearranging gives operating profits

$$\Pi_{it} = G e^{\beta_i x_t + z_{it}} K_{it}^{\theta}$$

where $G \equiv (1-\theta_2)\,\theta_2^{\frac{\theta_2}{1-\theta_2}}$, $\beta_i = \frac{1}{1-\theta_2}\left(\hat{\beta}_i - \omega\theta_2\right)$, $z_{it} = \frac{1}{1-\theta_2}\hat{z}_{it}$ and $\theta = \frac{\theta_1}{1-\theta_2}$, which is equation (11) in the text.

The first order and envelope conditions associated with (1) give the Euler equation:

$$\begin{aligned}
1 &= E_t\left[ M_{t+1}\left( \theta e^{z_{it+1} + \beta_i x_{t+1}} G K_{it+1}^{\theta-1} + 1 - \delta \right) \right] \\
  &= (1-\delta)\, E_t[M_{t+1}] + \theta G K_{it+1}^{\theta-1} E_t\left[ e^{m_{t+1} + z_{it+1} + \beta_i x_{t+1}} \right]
\end{aligned}$$

Substituting for $m_{t+1}$ and rearranging,

$$\begin{aligned}
E_t\left[ e^{m_{t+1} + z_{it+1} + \beta_i x_{t+1}} \right]
 &= E_t\left[ e^{\log\rho - \gamma_t \varepsilon_{t+1} - \frac{1}{2}\gamma_t^2\sigma_\varepsilon^2 + z_{it+1} + \beta_i x_{t+1}} \right] \\
 &= E_t\left[ e^{\log\rho + \rho_z z_{it} + \varepsilon_{it+1} + \beta_i\rho_x x_t + (\beta_i - \gamma_t)\varepsilon_{t+1} - \frac{1}{2}\gamma_t^2\sigma_\varepsilon^2} \right] \\
 &= e^{\log\rho + \rho_z z_{it} + \beta_i\rho_x x_t + \frac{1}{2}\sigma_{\tilde{\varepsilon}}^2 + \frac{1}{2}\beta_i^2\sigma_\varepsilon^2 - \beta_i\gamma_t\sigma_\varepsilon^2}
\end{aligned}$$

and

$$E_t[M_{t+1}] = E_t\left[ e^{\log\rho - \gamma_t\varepsilon_{t+1} - \frac{1}{2}\gamma_t^2\sigma_\varepsilon^2} \right] = e^{\log\rho + \frac{1}{2}\gamma_t^2\sigma_\varepsilon^2 - \frac{1}{2}\gamma_t^2\sigma_\varepsilon^2} = \rho$$

so that

$$\theta G K_{it+1}^{\theta-1} = \frac{1 - (1-\delta)\rho}{e^{\log\rho + \rho_z z_{it} + \beta_i\rho_x x_t + \frac{1}{2}\sigma_{\tilde{\varepsilon}}^2 + \frac{1}{2}\beta_i^2\sigma_\varepsilon^2 - \beta_i\gamma_t\sigma_\varepsilon^2}}$$

and rearranging and taking logs,

$$k_{it+1} = \frac{1}{1-\theta}\left( \tilde{\alpha} + \frac{1}{2}\sigma_{\tilde{\varepsilon}}^2 + \frac{1}{2}\beta_i^2\sigma_\varepsilon^2 + \rho_z z_{it} + \beta_i\rho_x x_t - \beta_i\gamma_t\sigma_\varepsilon^2 \right)$$

where

$$\tilde{\alpha} = \log\theta + \log G - \alpha, \qquad \alpha = -\log\rho + \log\left(1 - (1-\delta)\rho\right) = \log(r_f + \delta)$$

Ignoring the variance terms gives equation (12).
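As a quick consistency check on the derivation above, the following minimal sketch evaluates the capital policy (without variance terms, as in equation (12)) and confirms that the implied conditional expected mpk collapses to $\alpha + \beta_i\gamma_t\sigma_\varepsilon^2$, independent of the states $z_{it}$ and $x_t$. The function names and most parameter values are hypothetical choices for illustration; only $\rho_x = 0.92$ and $\sigma_\varepsilon = 0.0245$ are taken from Appendix B.4.

```python
import math


def alpha(rho, delta):
    """alpha = -log(rho) + log(1 - (1 - delta) * rho)."""
    return -math.log(rho) + math.log(1.0 - (1.0 - delta) * rho)


def log_capital(z, x, beta, gamma_t, *, theta, G, rho, delta, rho_z, rho_x, sig_eps2):
    """k_{it+1} from the no-adjustment-cost policy, ignoring variance terms."""
    alpha_tilde = math.log(theta) + math.log(G) - alpha(rho, delta)
    return (alpha_tilde + rho_z * z + beta * rho_x * x
            - beta * gamma_t * sig_eps2) / (1.0 - theta)


def expected_mpk(z, x, beta, gamma_t, **p):
    """E_t[mpk] = log(theta) + log(G) + E_t[z' + beta * x'] - (1 - theta) * k_{it+1}."""
    k_next = log_capital(z, x, beta, gamma_t, **p)
    return (math.log(p["theta"]) + math.log(p["G"])
            + p["rho_z"] * z + beta * p["rho_x"] * x
            - (1.0 - p["theta"]) * k_next)


# Hypothetical parameter values, except rho_x and sig_eps2 (Appendix B.4).
params = dict(theta=0.65, G=1.0, rho=0.96, delta=0.10,
              rho_z=0.90, rho_x=0.92, sig_eps2=0.0245 ** 2)
beta_i, gamma_t = 2.0, 20.0

lhs = expected_mpk(z=0.3, x=-0.1, beta=beta_i, gamma_t=gamma_t, **params)
rhs = alpha(params["rho"], params["delta"]) + beta_i * gamma_t * params["sig_eps2"]
print(lhs, rhs)  # the two numbers should agree up to rounding error
```

Changing `z` or `x` leaves `lhs` unchanged, which is the sense in which expected mpk differences across firms are driven by the risk premium term alone in this version of the model.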
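The realized mpk is given by

$$\begin{aligned}
mpk_{it+1} &= \log\theta + \pi_{it+1} - k_{it+1} \\
 &= \log\theta + \log G + z_{it+1} + \beta_i x_{t+1} - (1-\theta) k_{it+1} \\
 &= \log\theta + \log G + z_{it+1} + \beta_i x_{t+1} - \tilde{\alpha} - \rho_z z_{it} - \beta_i\rho_x x_t + \beta_i\gamma_t\sigma_\varepsilon^2 \\
 &= \alpha + \varepsilon_{it+1} + \beta_i\varepsilon_{t+1} + \beta_i\gamma_t\sigma_\varepsilon^2
\end{aligned}$$

The time t conditional expected mpk is

$$E_t[mpk_{it+1}] = \alpha + \beta_i\gamma_t\sigma_\varepsilon^2$$

and the time t and mean cross-sectional variances are, respectively,

$$\sigma^2_{E_t[mpk_{it+1}]} = \sigma_\beta^2\left(\gamma_t\sigma_\varepsilon^2\right)^2$$

$$E\left[\sigma^2_{E_t[mpk_{it+1}]}\right] = E\left[\sigma_\beta^2(\gamma_0 + \gamma_1 x_t)^2\left(\sigma_\varepsilon^2\right)^2\right] = \sigma_\beta^2\left(\gamma_0^2 + \gamma_1^2\sigma_x^2\right)\left(\sigma_\varepsilon^2\right)^2$$

D.2 Solution – Adjustment Costs

With capital adjustment costs, the firm’s investment problem takes the form

$$\begin{aligned}
V(X_t, Z_{it}, K_{it}) = \max_{K_{it+1}} \; & G X_t^{\beta_i} Z_{it} K_{it}^{\theta} - K_{it+1} + (1-\delta)K_{it} - \Phi(I_{it}, K_{it}) \\
 & + E_t\left[ M_{t+1} V(X_{t+1}, Z_{it+1}, K_{it+1}) \right] \qquad (38)
\end{aligned}$$

Policy function. The first order and envelope conditions associated with (38) give the Euler equation:

$$\begin{aligned}
1 + \xi\left(\frac{K_{it+1}}{K_{it}} - 1\right)
 &= E_t\left[ M_{t+1}\left( G\theta e^{z_{it+1}+\beta_i x_{t+1}} K_{it+1}^{\theta-1} + 1 - \delta - \frac{\xi}{2}\left(\frac{K_{it+2}}{K_{it+1}} - 1\right)^2 + \xi\left(\frac{K_{it+2}}{K_{it+1}} - 1\right)\frac{K_{it+2}}{K_{it+1}} \right) \right] \\
 &= E_t\left[ M_{t+1}\left( G\theta e^{z_{it+1}+\beta_i x_{t+1}} K_{it+1}^{\theta-1} + 1 - \delta + \frac{\xi}{2}\left(\frac{K_{it+2}}{K_{it+1}}\right)^2 - \frac{\xi}{2} \right) \right]
\end{aligned}$$

In the non-stochastic steady state,

$$\begin{aligned}
MPK &= G\theta K^{\theta-1} = \frac{1}{\rho} + \delta - 1 \;\Rightarrow\; K = \left[ \frac{1}{G\theta}\left( \frac{1}{\rho} + \delta - 1 \right) \right]^{\frac{1}{\theta-1}} \\
\Pi &= G K^{\theta} \;\Rightarrow\; D = G K^{\theta} - \delta K \\
P &= \frac{\rho}{1-\rho} D \\
R &= 1 + \frac{D}{P} = \frac{1}{\rho} \;\Rightarrow\; r_f = -\log\rho
\end{aligned}$$

Define the investment return:

$$R^I_{it+1} = \frac{ G\theta e^{z_{it+1}+\beta_i x_{t+1}} K_{it+1}^{\theta-1} + 1 - \delta + \frac{\xi}{2}\left(\frac{K_{it+2}}{K_{it+1}}\right)^2 - \frac{\xi}{2} }{ 1 + \xi\left(\frac{K_{it+1}}{K_{it}} - 1\right) }$$

and log-linearizing,

$$\begin{aligned}
r^I_{it+1} = {} & \rho G\theta K^{\theta-1}(z_{it+1} + \beta_i x_{t+1}) + \left( \rho G\theta(\theta-1)K^{\theta-1} - \xi(1+\rho) \right) k_{it+1} + \rho\xi k_{it+2} + \xi k_{it} \\
 & - \log\rho - \rho G\theta(\theta-1)K^{\theta-1} k
\end{aligned}$$

where $k = \log K$. Rearranging and suppressing constants yields expression (25).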
To derive the investment policy function, conjecture it takes the form kit+1 = φ0i + φ1 βi xt + φ2 zit + φ3 kit Then, kit+2 = φ0i (1 + φ3 ) + φ1 βi (ρx + φ3 ) xt + φ2 (ρz + φ3 ) zit + φ23 kit + φ1 βi εt+1 + φ2 εit+1 Substituting into the investment return, I rit+1 = + + + + ρGθ (θ − 1) K θ−1 − ξ (1 − ρφ3 ) φ0i − log ρ − ρGθ (θ − 1) K θ−1 k ρGθK θ−1 ρz + ρGθ (θ − 1) K θ−1 − ξ (1 + ρ) φ2 + ρξ (ρz + φ3 ) φ2 zit ρGθK θ−1 ρx + ρGθ (θ − 1) K θ−1 − ξ (1 + ρ) φ1 + ρξ (ρx + φ3 ) φ1 βi xt ρGθ (θ − 1) K θ−1 − ξ (1 + ρ) φ3 + ρξφ23 + ξ kit ρGθK θ−1 + ρξφ2 εit+1 + ρGθK θ−1 + ρξφ1 βi εt+1 and I rit+1 + mit+1 = + + + + 1 1 ρGθ (θ − 1) K θ−1 − ξ (1 − ρφ3 ) φ0i − ρGθ (θ − 1) K θ−1 k − γ02 σε2 − γ12 σε2 x2t 2 2 ρGθK θ−1 ρz + ρGθ (θ − 1) K θ−1 − ξ (1 + ρ) φ2 + ρξ (ρz + φ3 ) φ2 zit ρGθK θ−1 ρx + ρGθ (θ − 1) K θ−1 − ξ (1 + ρ) φ1 + ρξ (ρx + φ3 ) φ1 βi − γ0 γ1 σε2 xt ρGθ (θ − 1) K θ−1 − ξ (1 + ρ) φ3 + ρξφ23 + ξ kit ρGθK θ−1 + ρξφ2 εit+1 + ρGθK θ−1 + ρξφ1 βi − γ0 − γ1 xt εt+1 59 The Euler equation governing the investment return implies I 1 I 0 = Et rit+1 + mt+1 + var rit+1 + mit+1 2 θ−1 = ρGθ (θ − 1) K − ξ (1 − ρφ3 ) φ0i − ρGθ (θ − 1) K θ−1 k + ρGθK θ−1 ρz + ρGθ (θ − 1) K θ−1 − ξ (1 + ρ) φ2 + ρξ (ρz + φ3 ) φ2 zit + ρGθK θ−1 ρx + ρGθ (θ − 1) K θ−1 − ξ (1 + ρ) φ1 + ρξ (ρx + φ3 ) φ1 − ρGθK θ−1 + ρξφ1 γ1 σε2 βi xt + ρGθ (θ − 1) K θ−1 − ξ (1 + ρ) φ3 + ρξφ23 + ξ kit 2 1 ρGθK θ−1 + ρξφ2 σε̃2 + 2 2 1 + ρGθK θ−1 + ρξφ1 βi2 σε2 − ρGθK θ−1 + ρξφ1 βi γ0 σε2 2 and we can solve for the coefficients from: 0 = + + = = = ρGθ (θ − 1) K θ−1 − ξ (1 − ρφ3 ) φ0i − ρGθ (θ − 1) K θ−1 k 2 1 ρGθK θ−1 + ρξφ2 σε̃2 2 2 1 ρGθK θ−1 + ρξφ1 βi2 σε2 − ρGθK θ−1 + ρξφ1 βi γ0 σε2 2 ρGθK θ−1 ρz + ρGθ (θ − 1) K θ−1 − ξ (1 + ρ) φ2 + ρξ (ρz + φ3 ) φ2 ρGθK θ−1 ρx + ρGθ (θ − 1) K θ−1 − ξ (1 + ρ) φ1 + ρξ (ρx + φ3 ) φ1 − ρGθK θ−1 + ρξφ1 γ1 σε2 ρGθ (θ − 1) K θ−1 − ξ (1 + ρ) φ3 + ρξφ23 + ξ Define ξˆ = ξ ρGθK θ−1 = ξ . 
1−ρ(1−δ) 0 = φ1 = φ2 Then, ˆ 2 + ξˆ (θ − 1) − ξˆ (1 + ρ) φ3 + ρξφ 3 (ρx − γ1 σε2 ) φ3 ξˆ (1 − ρρx φ3 + ργ1 σε2 φ3 ) ρz φ3 = ˆ ξ (1 − ρρz φ3 ) φ0i = φ00 − φ01 βi + φ02 βi2 60 where φ00 φ01 φ02 ρGθ (1 − θ) K θ−1 k + 12 ρGθK θ−1 + ρξφ2 = ρGθ (1 − θ) K θ−1 + ξ (1 − ρφ3 ) γ0 σε2 φ3 = ξˆ (1 − ρφ3 ) 1 − ρρx φ3 + ργ1 σε2 φ3 2 σε̃2 2 ρGθK θ−1 ρξφ1 + 21 (ρξφ1 )2 + 21 ρGθK θ−1 = σε2 ρGθ (1 − θ) K θ−1 + ξ (1 − ρφ3 ) 1 Note that φξ̂3 goes to 1−θ as ξˆ goes to zero and zero as ξˆ goes to infinity. Again ignoring variance terms, and defining φ4 = γφ001 , the policy function is σ2 ε kit+1 = φ1 βi xt + φ2 zit + φ3 kit − φ4 βi γ0 σε2 + constant which is equation (27) in the text. Persistent MPK Dispersion. To derive expression (29), take the unconditional expectation of the policy function to obtain φ4 βi γ0 σε2 E [kit+1 ] = − 1 − φ3 and thus the unconditional expected mpk as E [mpkit+1 ] = (θ − 1) E [kit+1 ] + constant 1 (1 − θ) φ3 β γ σ 2 + constant = 2) i 0 ε ˆ 1 − ρφ (ρ − γ σ 3 x 1 ε ξ (1 − ρφ3 ) (1 − φ3 ) where we have substituted using the definition of φ4 . Lastly, we can use the definition of φ3 to show that the first fraction is equal to one and thus, E [mpkit+1 ] = D.3 1 βi γ0 σε2 + constant 2 1 − ρφ3 (ρx − γ1 σε ) Aggregation The first order condition on labor gives β̂i xt +ẑit Nit = θ2 e Wt 61 Kitθ1 1 ! 1−θ 2 and substituting for the wage, 1 1−θ 2 Nit = θ2 e(β̂i −ω)xt +ẑit Kitθ1 Labor market clearing gives: Z Nt = 1 1−θ2 Nit di = θ2 e Z 1 − 1−θ ωxt 1 e 1−θ2 2 β̂i xt +zit so that θ2 1−θ2 θ2 e 2 !θ2 Nt θ 2 ωx − 1−θ t = R 1 e 1−θ2 Kitθ di β̂i xt +zit Kitθ di Then, θ2 Yit = eβ̂i xt +ẑit Kitθ1 Nitθ2 = θ21−θ2 e 1 = R e 1−θ2 e β̂i xt +zit 1 β̂ x +zit 1−θ2 i t Kitθ θ 2 ωx − 1−θ t 2 1 e 1−θ2 β̂i xt +zit Kitθ θ Kitθ di θ2 Nt 2 By definition, 1 M P Kit = R β̂i xt +zit Kitθ−1 θ θ2 Nt 2 1 β̂ x +z e 1−θ2 i t it Kitθ di θe 1−θ2 and rearranging, Kit = θe 1 β̂ x +zit 1−θ2 i t 1 ! 1−θ M P Kit θ2 ! 
1−θ Nt R e 1 β̂ x +zit 1−θ2 i t Kitθ di Capital market clearing gives Z Kt = Nt 1 Kit di = θ 1−θ θ2 ! 1−θ Z R e 1 β̂ x +zit 1−θ2 i t so that Kitθ = R e 1 1 1 β̂i xt + 1−θ zit Kitθ di θ 1 − 1−θ M P Kit Kt 1 1 1 − 1−θ β̂ x + z 1−θ i t 1−θ it M P Kit di 1 1 β̂ x + 1 z 1−θ2 1−θ i t 1−θ it e 1−θ2 1 e 1−θ2 1−θ 62 − 1 M P Kit 1−θ di and substituting into the expression for Yit , e !θ e 1−θ2 R Yit = R 1 e 1−θ2 β̂i xt +zit 1 β̂ x + 1 z − 1 1−θ i t 1−θ it M P K 1−θ it 1 1 β̂ x + 1 z − 1 t 1−θ it e 1−θ2 1−θ i M P Kit 1−θ 1 1 β̂ x +zit 1−θ2 i t 1 β̂ x + 1 z − 1 1−θ i t 1−θ it M P K 1−θ it 1 1 β̂ x + 1 z − 1 t 1−θ it e 1−θ2 1−θ i M P Kit 1−θ Kt di θ 1 e 1−θ2 R θ2 Nt 2 !θ Kt di di 1 β̂ x + 1 z − θ 1−θ i t 1−θ it M P K 1−θ it !θ 1 1 β̂ x + 1 z − 1 t 1−θ it e 1−θ2 1−θ i M P Kit 1−θ di 1 e 1−θ2 R θ θ θ2 Kt 1 Nt 2 = 1 β̂ x + 1 z − θ 1−θ i t 1−θ it M P K 1−θ di it !θ 1 β̂ x + 1 z 1 − 1 t 1−θ it e 1−θ2 1−θ i M P Kit 1−θ di 1 R R e 1−θ2 Aggregate output is then Z Yt = Yit di = At Ktθ1 Ntθ2 where 1−θ2 1 1 1 − θ β̂i xt + 1−θ zit R e 1−θ 2 1−θ M P Kit 1−θ di At = θ 1 R 1 1 β̂i xt + 1 zit − 1−θ e 1−θ2 1−θ M P Kit 1−θ di Taking logs, at Z Z θ 1 1 1 1 1 − 1−θ − 1−θ β̂ x + 1 z β̂ x + 1 z 1−θ2 1−θ i t 1−θ it 1−θ2 1−θ i t 1−θ it M P Kit di − θ log e M P Kit di = (1 − θ2 ) log e The first expression in braces is equal to 1 1 ¯ θ 1 β̂xt − m̄pk + 1 − θ 1 − θ2 1−θ 2 − 1 1−θ 2 1 θ σmpk,β̂i xt +zit 2 (1 − θ) 1 − θ2 63 1 1 − θ2 2 ! x2t σβ̂2 + σz2 1 + 2 θ 1−θ 2 2 σmpk and the second to 1 ¯ θ 1 θ β̂xt − m̄pk + θ 1 − θ 1 − θ2 1−θ 2 − 1 1−θ 2 1 1 − θ2 2 ! x2t σβ̂2 + σz2 1 + θ 2 1 1−θ 2 2 σmpk θ 1 σmpk,β̂i xt +zit 2 (1 − θ) 1 − θ2 and combining (and using σβ = gives 1 θ 1 1 2 x2t σβ2 + σz2 − σmpk 21−θ 21−θ 1 θ 2 = a∗t − (1 − θ2 ) σmpk 2 1−θ 1 θ (1 − θ 1 2) 2 = a∗t − σ 2 1 − θ1 − θ2 mpk ¯ = β̂xt + (1 − θ2 ) at D.4 1 σ ) 1−θ2 β̂ Stock Market Returns We derive stock market returns in the environment with adjustment costs. This nests the simpler case without them when ξ = 0. 
Dividends are equal to zit+1 +βi xt+1 Dit+1 = e θ Kit+1 ξ − Kit+2 + (1 − δ) Kit+1 − 2 2 Kit+2 − 1 Kit+1 Kit+1 and log-linearizing, dit+1 Π Π K Π K K = (zit+1 + βi xt+1 ) + θ + (1 − δ) kit+1 − kit+2 + log D − θ − δ k D D D D D D where k = log K. Substituting for kit+1 and kit+2 from Appendix D.2 and rearranging, dit+1 = A0i + Ã1 zit + A1 βi xt + Ã2 εit+1 + A2 βi εt+1 + A3 kit 64 where A0i = A1 = Ã1 = A2 = Ã2 = A3 = K K Π (k − φ0i ) − φ0i φ3 log D − θ − δ D D D Π Π K ρx + θ + (1 − δ − ρx − φ3 ) φ1 D D D Π Π K ρz + θ + (1 − δ − ρz − φ3 ) φ2 D D D Π K − φ1 D D Π K − φ2 D D Π K θ + (1 − δ − φ3 ) φ3 D D By definition, returns are equal to Rit+1 = Dit+1 + Pit+1 Pit and log-linearizing, rit+1 = ρpit+1 + (1 − ρ) dit+1 − pit − log ρ + (1 − ρ) log P D Conjecture the stock price takes the form pit = c0i + c1 βi xt + c2 zit + c3 kit Then, rit+1 P = − log ρ + (1 − ρ) log + A0i − c0i + ρc3 φ0i D + (ρρz − 1) c2 + ρc3 φ2 + (1 − ρ) Ã1 zit + ((ρρx − 1) c1 + ρc3 φ1 + (1 − ρ) A1 ) βi xt + ((ρφ3 − 1) c3 + (1 − ρ) A3 ) kit + ρc2 + (1 − ρ) Ã2 εit+1 + (ρc1 + (1 − ρ) A2 ) βi εt+1 and the (log) excess return is the (negative of the) conditional covariance with the SDF: e log Et Rit+1 = (ρc1 + (1 − ρ) A2 ) βi γt σε2 65 To solve for the coefficients, use the Euler equation. 
First, rit+1 + mit+1 P 1 = (1 − ρ) log + A0i − c0i + ρc3 φ0i − γ02 σε2 D 2 + (ρρz − 1) c2 + ρc3 φ2 + (1 − ρ) Ã1 zit + ((ρρx − 1) c1 + ρc3 φ1 + (1 − ρ) A1 ) βi − γ0 γ1 σε2 xt + ((ρφ3 − 1) c3 + (1 − ρ) A3 ) kit 1 2 2 2 − γ1 σε xt 2 + ρc2 + (1 − ρ) Ã2 εit+1 + ((ρc1 + (1 − ρ) A2 ) βi − γ0 − γ1 xt ) εt+1 The Euler equation implies 1 0 = Et [rit+1 + mit+1 ] + var (rit+1 + mit+1 ) 2 P 1 = (1 − ρ) log + A0i − c0i + ρc3 φ0i + (ρc1 + (1 − ρ) A2 )2 βi2 σε2 − (ρc1 + (1 − ρ) A2 ) βi γ0 σε2 D 2 2 1 ρc2 + (1 − ρ) Ã2 σε̃2 + 2 + (ρρz − 1) c2 + ρc3 φ2 + (1 − ρ) Ã1 zit + (ρρx − 1) c1 + ρc3 φ1 + (1 − ρ) A1 − (ρc1 + (1 − ρ) A2 ) γ1 σε2 βi xt + ((ρφ3 − 1) c3 + (1 − ρ) A3 ) kit and so by undetermined coefficients, P 1 0 = (1 − ρ) log + A0i − c0i + ρc3 φ0i + (ρc1 + (1 − ρ) A2 )2 βi2 σε2 − (ρc1 + (1 − ρ) A2 ) βi γ0 σε2 D 2 2 1 + ρc2 + (1 − ρ) Ã2 σε̃2 2 = (ρρz − 1) c2 + ρc3 φ2 + (1 − ρ) Ã1 = (ρρx − 1) c1 + ρc3 φ1 + (1 − ρ) A1 − (ρc1 + (1 − ρ) A2 ) γ1 σε2 = (ρφ3 − 1) c3 + (1 − ρ) A3 66 or (1 − ρ) A3 1 − ρφ3 ρc3 φ2 + (1 − ρ) Ã1 = 1 − ρρz ρc3 φ1 + (1 − ρ) (A1 − A2 γ1 σε2 ) = 1 − ρρx + ργ1 σε2 c3 = c2 c1 Substituting for c1 we can solve for e ρ2 c3 φ1 + (1 − ρ) (ρA1 + (1 − ρρx ) A2 ) = βi γt σε2 log Et Rit+1 1 − ρρx + ρσε2 γ1 Solving for ρA1 + (1 − ρρx ) A2 = 1 ρ + δ − 1 − ρθφ1 φ3 1 ρ + δ (1 − θ) − 1 ρ2 (1 − ρ) φ1 φ3 = θ 1 − ρφ3 2 ρ c3 φ1 1 ρ 1 ρ − φ3 + δ (1 − θ) − 1 substituting into the return equation and simplifying, we obtain e log Et Rit+1 = ψβi γt σε2 where ψ= 1 ρ 1 ρ +δ−1 1−ρ + δ (1 − θ) − 1 1 − ρρx + ργ1 σε2 which is equation (21) in the text. The Sharpe ratio is the ratio of expected excess returns to the conditional standard deviation of the return: ψβi γt σε2 r SRit = 2 ρc2 + (1 − ρ) Ã2 We can solve for ρc2 + (1 − ρ) Ã2 = 1 ρ 1 ρ σε̃2 + ψ 2 βi2 σε2 +δ−1 1−ρ + δ (1 − θ) − 1 1 − ρρz and substituting and rearranging gives the expression in footnote 29. 
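For a perfectly diversified portfolio (i.e., the integral over individual returns), idiosyncratic shocks cancel, i.e., $\sigma_{\tilde{\varepsilon}}^2 = 0$ and $SR_{mt} = \gamma_t\sigma_\varepsilon$.

D.5 Autocorrelation of Investment

To derive the autocorrelation of investment, define net investment as $\Delta k_{it+1} = k_{it+1} - k_{it}$. We use the following:

$$\begin{aligned}
cov(\Delta z_{it}, z_{it}) &= cov\left( (\rho_z - 1) z_{it-1} + \varepsilon_{it},\; \rho_z z_{it-1} + \varepsilon_{it} \right) \\
 &= \rho_z(\rho_z - 1)\sigma_z^2 + \sigma_{\tilde{\varepsilon}}^2 = \frac{1}{1+\rho_z}\sigma_{\tilde{\varepsilon}}^2
\end{aligned}$$

$$\begin{aligned}
cov(\Delta k_{it}, z_{it}) &= cov(\Delta k_{it}, \rho_z z_{it-1} + \varepsilon_{it}) = \rho_z\, cov(\Delta k_{it}, z_{it-1}) \\
 &= \rho_z\, cov\left( \phi_1\beta_i\Delta x_{t-1} + \phi_2\Delta z_{it-1} + \phi_3\Delta k_{it-1},\; z_{it-1} \right) \\
 &= \rho_z\left( cov(\phi_2\Delta z_{it-1}, z_{it-1}) + \phi_3\, cov(\Delta k_{it-1}, z_{it-1}) \right) \\
 &= \rho_z\phi_2\frac{1}{1+\rho_z}\sigma_{\tilde{\varepsilon}}^2 + \rho_z\phi_3\, cov(\Delta k_{it-1}, z_{it-1})
\end{aligned}$$

so that

$$E\left[cov(\Delta k_{it}, z_{it})\right] = \frac{\rho_z}{1+\rho_z}\,\frac{\phi_2\sigma_{\tilde{\varepsilon}}^2}{1 - \phi_3\rho_z}$$

Next,

$$\begin{aligned}
cov(\Delta k_{it+1}, \Delta z_{it+1}) &= cov\left( \phi_1\beta_i\Delta x_t + \phi_2\Delta z_{it} + \phi_3\Delta k_{it},\; (\rho_z - 1) z_{it} + \varepsilon_{it+1} \right) \\
 &= \phi_2(\rho_z - 1)\, cov(\Delta z_{it}, z_{it}) + \phi_3(\rho_z - 1)\, cov(\Delta k_{it}, z_{it}) \\
 &= \phi_2(\rho_z - 1)\frac{1}{1+\rho_z}\sigma_{\tilde{\varepsilon}}^2 + \phi_3(\rho_z - 1)\frac{\rho_z}{1+\rho_z}\,\frac{\phi_2\sigma_{\tilde{\varepsilon}}^2}{1 - \phi_3\rho_z} \\
 &= \frac{\rho_z - 1}{1+\rho_z}\,\frac{\phi_2\sigma_{\tilde{\varepsilon}}^2}{1 - \phi_3\rho_z}
\end{aligned}$$

Similar steps give

$$cov(\Delta k_{it+1}, \Delta x_{t+1}) = \frac{\rho_x - 1}{1+\rho_x}\,\frac{\phi_1\beta_i\sigma_\varepsilon^2}{1 - \phi_3\rho_x}$$

Combining these gives the variance of investment:

$$\begin{aligned}
\sigma^2_{\Delta k} &= \phi_1^2\beta_i^2\, var(\Delta x_t) + \phi_2^2\, var(\Delta z_{it}) + \phi_3^2\sigma^2_{\Delta k} + 2\phi_1\phi_3\beta_i\, cov(\Delta x_t, \Delta k_{it}) + 2\phi_2\phi_3\, cov(\Delta z_{it}, \Delta k_{it}) \\
 &= \phi_1^2\beta_i^2\frac{2}{1+\rho_x}\sigma_\varepsilon^2 + \phi_2^2\frac{2}{1+\rho_z}\sigma_{\tilde{\varepsilon}}^2 + \phi_3^2\sigma^2_{\Delta k}
  + \frac{2\phi_1^2\phi_3\beta_i^2\sigma_\varepsilon^2}{1 - \phi_3\rho_x}\,\frac{\rho_x - 1}{1+\rho_x}
  + \frac{2\phi_2^2\phi_3\sigma_{\tilde{\varepsilon}}^2}{1 - \phi_3\rho_z}\,\frac{\rho_z - 1}{1+\rho_z} \\
 &= \phi_3^2\sigma^2_{\Delta k} + 2\phi_1^2\beta_i^2\frac{\sigma_\varepsilon^2}{1+\rho_x}\left( 1 + \frac{\phi_3(\rho_x - 1)}{1 - \phi_3\rho_x} \right) + 2\phi_2^2\frac{\sigma_{\tilde{\varepsilon}}^2}{1+\rho_z}\left( 1 + \frac{\phi_3(\rho_z - 1)}{1 - \phi_3\rho_z} \right) \\
 &= \frac{2}{1+\phi_3}\left( \phi_1^2\beta_i^2\sigma_\varepsilon^2\frac{1}{1+\rho_x}\,\frac{1}{1 - \phi_3\rho_x} + \phi_2^2\sigma_{\tilde{\varepsilon}}^2\frac{1}{1+\rho_z}\,\frac{1}{1 - \phi_3\rho_z} \right)
\end{aligned}$$

Next,

$$\begin{aligned}
cov(\Delta k_{it+1}, \Delta k_{it}) &= cov\left( \phi_1\beta_i\Delta x_t + \phi_2\Delta z_{it} + \phi_3\Delta k_{it},\; \Delta k_{it} \right) \\
 &= \phi_1\beta_i\, cov(\Delta x_t, \Delta k_{it}) + \phi_2\, cov(\Delta z_{it}, \Delta k_{it}) + \phi_3\sigma^2_{\Delta k} \\
 &= \phi_1^2\beta_i^2\sigma_\varepsilon^2\frac{\rho_x - 1}{1+\rho_x}\,\frac{1}{1 - \phi_3\rho_x} + \phi_2^2\sigma_{\tilde{\varepsilon}}^2\frac{\rho_z - 1}{1+\rho_z}\,\frac{1}{1 - \phi_3\rho_z} + \phi_3\sigma^2_{\Delta k}
\end{aligned}$$

and the autocorrelation is:

$$corr(\Delta k_{it+1}, \Delta k_{it}) = \phi_3 + \frac{1+\phi_3}{2}\;\frac{ \phi_1^2\beta_i^2\sigma_\varepsilon^2\frac{\rho_x - 1}{1+\rho_x}\,\frac{1}{1 - \phi_3\rho_x} + \phi_2^2\sigma_{\tilde{\varepsilon}}^2\frac{\rho_z - 1}{1+\rho_z}\,\frac{1}{1 - \phi_3\rho_z} }{ \phi_1^2\beta_i^2\sigma_\varepsilon^2\frac{1}{1+\rho_x}\,\frac{1}{1 - \phi_3\rho_x} + \phi_2^2\sigma_{\tilde{\varepsilon}}^2\frac{1}{1+\rho_z}\,\frac{1}{1 - \phi_3\rho_z} } \qquad (39)$$

Notice that this approaches

$$corr(\Delta k_{it+1}, \Delta k_{it}) = \phi_3 + (1 + \phi_3)\frac{\rho_x - 1}{2}$$

as $\rho_z$ and $\rho_x$ become close, since the ratio in (39) then reduces to $\rho_x - 1$. Further, in the case both shocks follow a random walk, the autocorrelation is simply equal to $\phi_3$.

E Numerical Procedure

Our numerical approach to parameterize the model is as follows. To accurately capture the properties of the time-varying risk premium, we solve for returns numerically using a fourth-order approximation in Dynare++. For a given set of the parameters $\gamma_0$, $\gamma_1$, $\xi$ and $\sigma_\beta^2$, we solve the model for a wide grid of beta-types centered around the mean beta. We use an 11 point grid ranging from -3 to 7 (the results are not overly sensitive to the width of the grid). We simulate a time series of excess returns for a large number of firms of each type, which results in a large panel of excess returns. Averaging returns across these firms in each time period yields a series for the market excess return. We can then compute the mean and standard deviation (and thus the Sharpe ratio) of the market return. Next, we compute the expected return for each beta-type in each time period directly as the conditional expectation $E_t[R_{it+1}] = E_t\left[\frac{D_{it+1} + P_{it+1}}{P_{it}}\right]$ and then average over the time periods to obtain the average expected return for firms of each type. We then use these values to calculate the dispersion in expected returns, $\sigma^2_{Er}$, interpolating for values of $\beta$ that are not on the grid. We use a simulated investment series to calculate the autocorrelation of investment. Finally, we find the set of the four parameters, $\gamma_0$, $\gamma_1$, $\sigma_\beta^2$ and $\xi$, that make the simulated moments consistent with the empirical ones, i.e., (i) market excess return, (ii) market Sharpe ratio, (iii) cross-sectional dispersion in expected returns and (iv) the autocorrelation of investment. As shown in column (1) of the bottom panel of Table 4, the simulated moments are quite close to their empirical counterparts.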
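The autocorrelation of investment targeted as moment (iv) is obtained from simulation in the procedure above, but the closed form in expression (39) gives a convenient cross-check on the log-linear policy. The sketch below evaluates (39) directly; the function name and the parameter values are hypothetical and chosen only to exercise the two special cases noted in Appendix D.5 (equal persistence of the two shocks, and random-walk shocks).

```python
def investment_autocorr(phi1, phi2, phi3, beta, rho_x, rho_z, sig_eps2, sig_idio2):
    """Autocorrelation of net investment implied by expression (39).

    sig_eps2 is the variance of the aggregate innovation and sig_idio2 the
    variance of the idiosyncratic innovation.
    """
    # Weights that appear in both the numerator and the denominator of (39).
    w_x = phi1 ** 2 * beta ** 2 * sig_eps2 / ((1.0 + rho_x) * (1.0 - phi3 * rho_x))
    w_z = phi2 ** 2 * sig_idio2 / ((1.0 + rho_z) * (1.0 - phi3 * rho_z))
    ratio = (w_x * (rho_x - 1.0) + w_z * (rho_z - 1.0)) / (w_x + w_z)
    return phi3 + 0.5 * (1.0 + phi3) * ratio


# Hypothetical coefficients, for illustration only.
common = dict(phi1=0.4, phi2=0.3, phi3=0.5, beta=2.0, sig_eps2=0.0006, sig_idio2=0.02)

equal_persistence = investment_autocorr(rho_x=0.9, rho_z=0.9, **common)
random_walk = investment_autocorr(rho_x=1.0, rho_z=1.0, **common)
print(equal_persistence, random_walk)
```

With equal persistence the weights cancel and the result depends only on $\phi_3$ and the common persistence; with random-walk shocks the correction term vanishes and the autocorrelation equals $\phi_3$.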
F Extensions

Our baseline framework in Section 3 features (i) a single source of aggregate risk and (ii) a tight connection between financial market conditions and the “real” side of the economy – indeed, the state of technology determined both the common component of firm-level productivities and the price of risk simultaneously. In this appendix, we generalize that setup to allow for (i) multiple risk factors and (ii) more flexible formulations of the determinants of financial conditions. Although empirically disciplining the additional factors added here may be challenging, we demonstrate that the same insights from the baseline analysis go through. We also study versions of the model where heterogeneity in risk premia stems from “alphas” or “mis-pricing” in addition to betas and from differences across firms in exposure to capital price shocks, rather than productivity shocks, and show that our main results continue to hold.

F.1 Multifactor Model

There are $J$ aggregate risk factors in the economy. Firms have heterogeneous loadings on these factors, so that the profit function (in logs) takes the form

$$
\pi_{it} = \beta_i x_t + z_{it} + \theta k_{it} \tag{40}
$$

where $\beta_i$ is a vector of factor loadings of firm $i$, e.g., the $j$-th element of $\beta_i$ is the loading of firm $i$ profits on factor $j$, and $x_t$ is the vector of factor realizations at time $t$, i.e.,

$$
\beta_i = \begin{pmatrix} \beta_{1i} \\ \beta_{2i} \\ \vdots \\ \beta_{Ji} \end{pmatrix}', \qquad
x_t = \begin{pmatrix} x_{1t} \\ x_{2t} \\ \vdots \\ x_{Jt} \end{pmatrix}
$$

Each factor, indexed by $j$, follows an AR(1) process

$$
x_{jt+1} = \rho_j x_{jt} + \varepsilon_{jt+1}, \qquad \varepsilon_{jt+1} \sim N\left(0, \sigma_{\varepsilon_j}^2\right) \tag{41}
$$

where the innovations are potentially correlated across factors. Denote by $\Sigma_f$ the covariance matrix of factor innovations, i.e.,

$$
\Sigma_f = \begin{pmatrix}
\sigma_{\varepsilon_1}^2 & \sigma_{\varepsilon_1,\varepsilon_2} & \cdots & \sigma_{\varepsilon_1,\varepsilon_J} \\
\sigma_{\varepsilon_2,\varepsilon_1} & \sigma_{\varepsilon_2}^2 & \cdots & \sigma_{\varepsilon_2,\varepsilon_J} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{\varepsilon_J,\varepsilon_1} & \sigma_{\varepsilon_J,\varepsilon_2} & \cdots & \sigma_{\varepsilon_J}^2
\end{pmatrix}
$$

The idiosyncratic component of firm productivity follows

$$
z_{it+1} = \rho_z z_{it} + \varepsilon_{it+1}, \qquad \varepsilon_{it+1} \sim N\left(0, \sigma_{\tilde\varepsilon}^2\right) \tag{42}
$$

The stochastic discount factor takes the form

$$
m_{t+1} = \log\rho - \gamma\varepsilon_{t+1} - \frac{1}{2}\gamma\Sigma_f\gamma' \tag{43}
$$

where $\gamma$ is a vector of factor exposures, e.g., element $\gamma_j$ captures the exposure of the SDF to the $j$-th factor, and $\varepsilon_{t+1}$ is the vector of innovations in each factor, i.e.,

$$
\gamma = \begin{pmatrix} \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_J \end{pmatrix}', \qquad
\varepsilon_{t+1} = \begin{pmatrix} \varepsilon_{1t+1} \\ \varepsilon_{2t+1} \\ \vdots \\ \varepsilon_{Jt+1} \end{pmatrix}
$$

For purposes of illustration, we assume $\gamma$ is constant through time and there are no adjustment costs, although these assumptions are easily relaxed. Expressions (40), (41), (42) and (43) are simple extensions of (11), (9) and (10). Following a similar derivation as Appendix D.1, we can derive the realized mpk:

$$
mpk_{it+1} = \alpha + \varepsilon_{it+1} + \beta_i\varepsilon_{t+1} + \beta_i\Sigma_f\gamma'
$$

where $\beta_i$ and $\varepsilon_{t+1}$ denote vectors of factor loadings and shocks. The expected mpk and its cross-sectional dispersion are given by

$$
E_t[mpk_{it+1}] = \alpha + \beta_i\Sigma_f\gamma', \qquad \sigma^2_{E_t[mpk]} = \gamma\Sigma_f'\Sigma_\beta\Sigma_f\gamma'
$$

where $\Sigma_\beta$ is the covariance matrix of factor loadings across firms, i.e.,

$$
\Sigma_\beta = \begin{pmatrix}
\sigma_{\beta_1}^2 & \sigma_{\beta_1,\beta_2} & \cdots & \sigma_{\beta_1,\beta_J} \\
\sigma_{\beta_2,\beta_1} & \sigma_{\beta_2}^2 & \cdots & \sigma_{\beta_2,\beta_J} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{\beta_J,\beta_1} & \sigma_{\beta_J,\beta_2} & \cdots & \sigma_{\beta_J}^2
\end{pmatrix}
$$

This is the natural analog of expression (16): (i) expected mpk is determined by the firm’s exposure to (all) the aggregate risk factors in the economy and the risk prices of those factors, and (ii) mpk dispersion is a function of the dispersion in those exposures across firms as captured by $\Sigma_\beta$. Similar steps as Appendix D.4 give the following (approximate) expression for expected excess stock returns and the cross-sectional dispersion in expected returns:

$$
E r^e_{it+1} = \beta_i\psi\Sigma_f\gamma', \qquad \sigma^2_{Er^e_t} = \gamma\Sigma_f'\psi'\Sigma_\beta\psi\Sigma_f\gamma'
$$

where $\psi$ is a diagonal matrix with

$$
\psi_{jj} = \frac{\frac{1}{\rho} + \delta - 1}{\frac{1}{\rho} + (1-\theta)\delta - 1}\,\frac{1-\rho}{1-\rho\rho_j}
$$

where $\rho_j$ denotes the persistence of factor $j$. These are the analogs of expressions (21) and (22) – expected returns depend on factor exposures and the risk prices of those factors.
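As a small numerical illustration of the multifactor expressions for expected mpk and its dispersion, the sketch below evaluates $\alpha + \beta_i\Sigma_f\gamma'$ and $\gamma\Sigma_f'\Sigma_\beta\Sigma_f\gamma'$ with plain Python lists. The helper names and every number are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def expected_mpk(alpha, beta_i, sigma_f, gamma):
    """E_t[mpk_{it+1}] = alpha + beta_i * Sigma_f * gamma'."""
    return alpha + dot(beta_i, mat_vec(sigma_f, gamma))


def mpk_dispersion(sigma_beta, sigma_f, gamma):
    """Cross-sectional variance gamma * Sigma_f' * Sigma_beta * Sigma_f * gamma'."""
    v = mat_vec(sigma_f, gamma)  # Sigma_f * gamma' (Sigma_f is symmetric)
    return dot(v, mat_vec(sigma_beta, v))


if __name__ == "__main__":
    # Hypothetical two-factor example.
    sigma_f = [[0.04, 0.01], [0.01, 0.09]]    # covariance of factor innovations
    sigma_beta = [[1.5, 0.2], [0.2, 0.8]]     # covariance of loadings across firms
    gamma = [2.0, 0.5]                        # SDF exposures
    print(expected_mpk(0.1, [1.2, 0.7], sigma_f, gamma))
    print(mpk_dispersion(sigma_beta, sigma_f, gamma))
```

With a single factor the dispersion function reduces to $\sigma_\beta^2(\gamma\sigma_\varepsilon^2)^2$, the single-factor counterpart of the expression above.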
Expected return dispersion depends on the dispersion in those exposures, here captured by $\Sigma_\beta$. Thus, the same insights from the single factor model go through – dispersion in Empk and expected returns are both determined by variation in exposures to the set of aggregate factors and hence, there is a tight relationship between the two. To quantify the impact of these factors on mpk dispersion, however, we would need to know all the primitives governing the dynamics of the factors, e.g., the vector of persistences $\rho$ and the covariance matrix $\Sigma_f$, and exposures, i.e., the exposures of the SDF, $\gamma$, and the covariance matrix of firm loadings, $\Sigma_\beta$. This would likely entail taking a stand on the nature of each factor, computing their properties from the data and calibrating/estimating the $\gamma$ vector and the covariance matrix of firm exposures, $\Sigma_\beta$.

F.2 Financial Shocks

Our baseline model tightly linked financial conditions, for example, the price of risk, to macroeconomic conditions, i.e., the state of aggregate technology. However, financial conditions may not co-move one-for-one with the “real” business cycle. Here, we extend the setup to include pure financial shocks. The stochastic discount factor takes the form

$$
m_{t+1} = \log\rho - \gamma_t\varepsilon_{t+1} - \frac{1}{2}\gamma_t^2\sigma_\varepsilon^2, \qquad \gamma_t = \gamma_0 + \gamma_f f_t, \tag{44}
$$

where

$$
f_{t+1} = \rho_f f_t + \varepsilon_f, \qquad \varepsilon_f \sim N\left(0, \sigma_{\varepsilon_f}^2\right).
$$

In this formulation, $f_t$ denotes the time-varying state of financial conditions, which is now disconnected from the state of aggregate technology. These financial factors may be correlated with real conditions, $x_t$, but need not be perfectly so. Thus, there is scope for changes in financial conditions, independent of those in real conditions, to affect the price of risk and through this channel, the allocation of capital.69 Note the difference between this setup and the one in Section F.1 – here, the financial factor, $f_t$, does not directly enter the profit function of the firm; it only affects the price of risk.
Thus, it is a shock purely to financial market conditions. In contrast, the factors considered in Section F.1 directly affected firm profitability. Keeping the remainder of the environment the same as Section 3, we can derive exactly the same expressions for expected mpk and its cross-sectional variance, i.e.,

$$
E_t[mpk_{it+1}] = \alpha + \beta_i\gamma_t\sigma_\varepsilon^2, \qquad \sigma^2_{E_t[mpk_{it+1}]} = \sigma_\beta^2\left(\gamma_t\sigma_\varepsilon^2\right)^2,
$$

where now $\gamma_t$ is a function of financial market conditions. When credit market conditions tighten (i.e., when $f_t$ is small/negative since $\gamma_f < 0$), $\gamma_t$ is high and mpk dispersion will rise. Just as in Section 3, the conditional expectation of one-period ahead TFP is given by

$$
E_t[a_{t+1}] = E_t\left[a^*_{t+1}\right] - \frac{1}{2}\,\frac{\theta_1(1-\theta_2)}{1-\theta_1-\theta_2}\,\sigma_\beta^2\left(\gamma_t\sigma_\varepsilon^2\right)^2 \tag{45}
$$

which illustrates the effects of a deterioration in financial conditions on macroeconomic performance – when credit market conditions tighten and risk premia rise (i.e., $f_t$ falls), the resulting increase in mpk dispersion leads to a fall in aggregate TFP. Finally, the average long-run level of Empk dispersion and aggregate TFP are given by

$$
E\left[\sigma^2_{Empk_t}\right] = \sigma_\beta^2\left(\gamma_0^2 + \gamma_f^2\sigma_f^2\right)\left(\sigma_\varepsilon^2\right)^2,
$$

$$
\bar{a} = a^* - \frac{1}{2}\,\frac{\theta_1(1-\theta_2)}{1-\theta_1-\theta_2}\,\sigma_\beta^2\left(\gamma_0^2 + \gamma_f^2\sigma_f^2\right)\left(\sigma_\varepsilon^2\right)^2,
$$

where $\sigma_f^2 = \frac{\sigma_{\varepsilon_f}^2}{1-\rho_f^2}$. The expressions reveal a tight connection between financial conditions and long-run performance of the economy – higher financial volatility ($\sigma_{\varepsilon_f}^2$), even independent of the state of the macroeconomy, induces greater persistent MPK dispersion and depresses the average level of achieved productivity.

69 Our baseline model is the nested case where $\gamma_f = \gamma_1$ and $f_t$ and $x_t$ are perfectly correlated.

G Other Distortions

With other distortions, the derivations are similar to those in Appendix D.1. The Euler equation is given by

$$
\begin{aligned}
1 &= E_t\left[M_{t+1}\left(\theta e^{\tau_{it+1}+z_{it+1}+\beta_i x_{t+1}} G K_{it+1}^{\theta-1} + 1 - \delta\right)\right] \\
&= (1-\delta)E_t[M_{t+1}] + \theta G K_{it+1}^{\theta-1} E_t\left[e^{m_{t+1}+\tau_{it+1}+z_{it+1}+\beta_i x_{t+1}}\right]
\end{aligned}
$$

Idiosyncratic distortions.
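Substituting for $m_{t+1}$ and $\tau_{it+1}$ and rearranging,

$$
\begin{aligned}
E_t\left[e^{m_{t+1}+\tau_{it+1}+z_{it+1}+\beta_i x_{t+1}}\right] &= E_t\left[e^{\log\rho - \gamma_t\varepsilon_{t+1} - \frac{1}{2}\gamma_t^2\sigma_\varepsilon^2 - \nu_1 z_{it+1} - \eta_{it+1} + z_{it+1} + \beta_i x_{t+1}}\right] \\
&= E_t\left[e^{\log\rho + (1-\nu_1)\rho_z z_{it} + (1-\nu_1)\varepsilon_{it+1} + \beta_i\rho_x x_t + (\beta_i-\gamma_t)\varepsilon_{t+1} - \frac{1}{2}\gamma_t^2\sigma_\varepsilon^2 - \eta_{it+1}}\right] \\
&= e^{\log\rho + (1-\nu_1)\rho_z z_{it} + \beta_i\rho_x x_t + \frac{1}{2}(1-\nu_1)^2\sigma_{\tilde\varepsilon}^2 + \frac{1}{2}\beta_i^2\sigma_\varepsilon^2 - \beta_i\gamma_t\sigma_\varepsilon^2 - \eta_{it+1}}
\end{aligned}
$$

so that

$$
\theta G K_{it+1}^{\theta-1} = \frac{1-(1-\delta)\rho}{e^{\log\rho + (1-\nu_1)\rho_z z_{it} + \beta_i\rho_x x_t + \frac{1}{2}(1-\nu_1)^2\sigma_{\tilde\varepsilon}^2 + \frac{1}{2}\beta_i^2\sigma_\varepsilon^2 - \beta_i\gamma_t\sigma_\varepsilon^2 - \eta_{it+1}}}
$$

and rearranging and taking logs,

$$
k_{it+1} = \frac{1}{1-\theta}\left(\tilde\alpha + \frac{1}{2}(1-\nu_1)^2\sigma_{\tilde\varepsilon}^2 + \frac{1}{2}\beta_i^2\sigma_\varepsilon^2 + (1-\nu_1)\rho_z z_{it} + \beta_i\rho_x x_t - \beta_i\gamma_t\sigma_\varepsilon^2 - \eta_{it+1}\right)
$$

where $\tilde\alpha$ and $\alpha$ are as defined in Appendix D.1.

The realized mpk is given by (ignoring the variance terms)

$$
\begin{aligned}
mpk_{it+1} &= \log\theta + \pi_{it+1} - k_{it+1} \\
&= \log\theta + \log G + z_{it+1} + \beta_i x_{t+1} - (1-\theta)k_{it+1} \\
&= \log\theta + \log G + z_{it+1} + \beta_i x_{t+1} - \tilde\alpha - (1-\nu_1)\rho_z z_{it} - \beta_i\rho_x x_t + \beta_i\gamma_t\sigma_\varepsilon^2 + \eta_{it+1} \\
&= \alpha + \varepsilon_{it+1} + \beta_i\varepsilon_{t+1} + \nu_1\rho_z z_{it} + \beta_i\gamma_t\sigma_\varepsilon^2 + \eta_{it+1}
\end{aligned}
$$

The conditional expected mpk is

$$
E_t[mpk_{it+1}] = \alpha + \nu_1\rho_z z_{it} + \beta_i\gamma_t\sigma_\varepsilon^2 + \eta_{it+1}
$$

and the cross-sectional variance is

$$
\sigma^2_{E_t[mpk_{it+1}]} = (\nu_1\rho_z)^2\sigma_z^2 + \sigma_\eta^2 + \left(\gamma_t\sigma_\varepsilon^2\right)^2\sigma_\beta^2 \tag{46}
$$

Deriving stock returns follows closely the steps in Appendix D.4. Dividends are equal to

$$
D_{it+1} = e^{\tau_{it+1}+z_{it+1}+\beta_i x_{t+1}} K_{it+1}^\theta - K_{it+2} + (1-\delta)K_{it+1} - \frac{\xi}{2}\left(\frac{K_{it+2}}{K_{it+1}} - 1\right)^2 K_{it+1}
$$

and log-linearizing,

$$
d_{it+1} = \frac{\Pi}{D}\left(\tau_{it+1} + z_{it+1} + \beta_i x_{t+1}\right) + \left(\theta\frac{\Pi}{D} + (1-\delta)\frac{K}{D}\right)k_{it+1} - \frac{K}{D}k_{it+2} + \log D - \left(\theta\frac{\Pi}{D} - \delta\frac{K}{D}\right)k
$$

where $k = \log K$.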
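Substituting for $k_{it+1}$ and $k_{it+2}$ from above,

$$
d_{it+1} = A_0 + \tilde A_1 z_{it} + A_1\beta_i x_t + \tilde A_2\varepsilon_{it+1} + A_2\beta_i\varepsilon_{t+1} + A_3\eta_{it+1} + A_4\eta_{it+2}
$$

where

$$
\begin{aligned}
A_0 &= \log D - \left(\theta\frac{\Pi}{D} - \delta\frac{K}{D}\right)\left(k - \frac{\tilde\alpha}{1-\theta}\right) \\
A_1 &= \frac{1}{1-\theta}\left(\frac{\Pi}{D} + (1-\delta-\rho_x)\frac{K}{D}\right)\rho_x - \frac{1}{1-\theta}\left(\theta\frac{\Pi}{D} + (1-\delta-\rho_x)\frac{K}{D}\right)\gamma_1\sigma_\varepsilon^2 \\
\tilde A_1 &= \frac{1-\nu_1}{1-\theta}\left(\frac{\Pi}{D} + (1-\delta-\rho_z)\frac{K}{D}\right)\rho_z \\
A_2 &= \frac{\Pi}{D} - \frac{1}{1-\theta}\frac{K}{D}\rho_x + \frac{1}{1-\theta}\frac{K}{D}\gamma_1\sigma_\varepsilon^2 \\
\tilde A_2 &= (1-\nu_1)\left(\frac{\Pi}{D} - \frac{1}{1-\theta}\frac{K}{D}\rho_z\right) \\
A_3 &= -\frac{1}{1-\theta}\left(\theta\frac{\Pi}{D} + (1-\delta)\frac{K}{D}\right) \\
A_4 &= \frac{1}{1-\theta}\frac{K}{D}
\end{aligned}
$$

Using the log-linearized return equation,

$$
r_{it+1} = \rho p_{it+1} + (1-\rho)d_{it+1} - p_{it} - \log\rho + (1-\rho)\log\frac{P}{D}
$$

and conjecturing the stock price takes the form $p_{it} = c_{0i} + c_1\beta_i x_t + c_2 z_{it} + c_3\eta_{it+1}$ gives

$$
\begin{aligned}
r_{it+1} = {} & -\log\rho + (1-\rho)\left(\log\frac{P}{D} + A_0 - c_{0i}\right) \\
& + \left((\rho\rho_z - 1)c_2 + (1-\rho)\tilde A_1\right)z_{it} + \left((\rho\rho_x - 1)c_1 + (1-\rho)A_1\right)\beta_i x_t \\
& + \left(\rho c_2 + (1-\rho)\tilde A_2\right)\varepsilon_{it+1} + \left(\rho c_1 + (1-\rho)A_2\right)\beta_i\varepsilon_{t+1} \\
& + \left(\rho c_3 + (1-\rho)A_4\right)\eta_{it+2} + \left((1-\rho)A_3 - c_3\right)\eta_{it+1}
\end{aligned}
$$

The (log) excess return is the (negative of the) conditional covariance with the SDF:

$$
\log E_t\left[R^e_{it+1}\right] = \left(\rho c_1 + (1-\rho)A_2\right)\beta_i\gamma_t\sigma_\varepsilon^2
$$

$A_2$ is independent of $\nu_1$ and $\eta$. Following the same steps as in Appendix D.4, it is easily verified that $c_1$ is independent of these terms as well. Thus, expected returns are independent of distortions.

Aggregate distortions. Consider the first formulation, i.e.,

$$
\tau_{it+1} = -\nu_1 z_{it+1} - \nu_2 x_{t+1} - \eta_{it+1}
$$

Similar steps as above give expression (46). Dispersion in expected stock market returns is similarly unaffected. Next, consider the second formulation:

$$
\tau_{it+1} = -\nu_1 z_{it+1} - \nu_3\beta_i x_{t+1} - \eta_{it+1}
$$

In this case, similar steps as above give the conditional expected mpk as

$$
E_t[mpk_{it+1}] = \alpha + \nu_1\rho_z z_{it} + \nu_3\beta_i\rho_x x_t + (1-\nu_3)\beta_i\gamma_t\sigma_\varepsilon^2 + \eta_{it+1}
$$

and expected excess stock market returns as

$$
\log E_t\left[R^e_{it+1}\right] = (1-\nu_3)\psi\beta_i\gamma_t\sigma_\varepsilon^2
$$

where $\psi$ is as defined in expression (21). In other words, the risk-premium effects on expected mpk and on expected returns are both scaled by a factor $1-\nu_3$.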
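To see how the expected mpk under the second formulation maps into dispersion, the following worked step may help. It is a sketch under stated assumptions rather than part of the original derivation: $\beta_i$, $z_{it}$ and $\eta_{it+1}$ are taken to be mutually independent across firms, and the price of risk is taken to follow the baseline form $\gamma_t = \gamma_0 + \gamma_1 x_t$ with $E[x_t] = 0$ and $\mathrm{var}(x_t) = \sigma_x^2$.

```latex
% Cross-sectional variance at date t: collect the terms multiplying beta_i
\sigma^2_{E_t[mpk_{it+1}]}
  = (\nu_1\rho_z)^2\sigma_z^2 + \sigma_\eta^2
  + \sigma_\beta^2\left(\nu_3\rho_x x_t + (1-\nu_3)\gamma_t\sigma_\varepsilon^2\right)^2

% Time average of the squared term, using gamma_t = gamma_0 + gamma_1 x_t
E\left[\left(\nu_3\rho_x x_t + (1-\nu_3)(\gamma_0+\gamma_1 x_t)\sigma_\varepsilon^2\right)^2\right]
  = (\nu_3\rho_x)^2\sigma_x^2
  + (1-\nu_3)^2\left(\sigma_\varepsilon^2\right)^2\left(\gamma_0^2+\gamma_1^2\sigma_x^2\right)
  + 2\nu_3(1-\nu_3)\rho_x\gamma_1\sigma_\varepsilon^2\sigma_x^2
```

The cross term is the only piece that mixes the distortion parameter $\nu_3$ with the price-of-risk slope $\gamma_1$, and its weight $2\nu_3(1-\nu_3)$ is largest in absolute value at $\nu_3 = 0.5$.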
The mean level of expected mpk and return dispersion are, respectively,

$$
\begin{aligned}
E\left[\sigma^2_{E_t[mpk_{it+1}]}\right] = {} & \sigma_\eta^2 + (\nu_1\rho_z)^2\sigma_z^2 + (\nu_3\rho_x)^2\sigma_x^2\sigma_\beta^2 \\
& + (1-\nu_3)^2\left(\sigma_\varepsilon^2\right)^2\left(\gamma_0^2 + \gamma_1^2\sigma_x^2\right)\sigma_\beta^2 + 2\nu_3(1-\nu_3)\rho_x\sigma_x^2\gamma_1\sigma_\varepsilon^2\sigma_\beta^2
\end{aligned}
$$

$$
E\left[\sigma^2_{\log E_t[R^e_{it+1}]}\right] = (1-\nu_3)^2\left(\psi\sigma_\varepsilon^2\right)^2\left(\gamma_0^2 + \gamma_1^2\sigma_x^2\right)\sigma_\beta^2
$$

The last two terms of the first equation capture the mpk effects of risk premia. The last term there is new and does not have a counterpart in the second equation – in other words, using dispersion in expected returns would give the second to last term, as usual, but not the last. If $\nu_3 < 0$, it is straightforward to verify that this term is positive (recall that $\gamma_1$ is negative). Then, we may be understating risk premium effects. If $\nu_3 > 0$, the last term is negative and we may be overstating them. In this latter case, we can obtain an upper bound on the extent of the potential bias as follows: holding the other parameters fixed, the term is most negative for $\nu_3 = 0.5$. Using this value, along with the estimated values of the other parameters, yields a value of the bias that is at most about 0.03.

H Robustness – Productivity Betas

In this appendix, we investigate the potential effects of (i) mis-measurement of firm-level capital and (ii) unobserved heterogeneity in $\theta$ on our estimates of productivity betas in Section 4.4. First, to see the effects of mis-measured capital or measurement error, assume that the measured capital stock is $\hat k_{it} = k_{it} + e_{it}$, where $k_{it}$ is true capital and $e_{it}$ the mis-measurement. Measured firm-level productivity growth is then equal to $\Delta z_{it} + \beta_i\Delta x_t - \theta\Delta e_{it}$. Regressing this on measures of aggregate productivity, i.e., $\Delta x_t$, it is straightforward to see that the estimated $\beta$’s would be unaffected so long as changes in mis-measurement at the firm-level ($\Delta e_{it}$) are uncorrelated with the business cycle, which may be a reasonable conjecture.
Put another way, mis-measured capital in this analysis leads to measurement errors in the dependent variable, which, under relatively mild conditions, are innocuous.70

How about unobserved heterogeneity in $\theta$? It turns out this will have small effects as well. To see this, let $\theta_i$ denote the true firm-specific parameter and $\theta$ our assumed common value. Measured firm-level productivity growth is then equal to $\Delta z_{it} + \beta_i\Delta x_t - (\theta-\theta_i)\Delta k_{it}$ and regressing this on aggregate productivity growth gives

$$
\beta_i - \frac{(\theta-\theta_i)\,\mathrm{cov}(\Delta k_{it}, \Delta x_t)}{\mathrm{var}(\Delta x_t)},
$$

where $\beta_i = \frac{\hat\beta_i - \theta_{2i}\omega}{1-\theta_{2i}}$ is the effective true $\beta$ (see Section 5 and Appendix I for a further discussion of this expression). The second term represents the potential bias, which depends on the covariance between investment and changes in aggregate productivity. How large is this covariance? As one example, consider the case with no adjustment costs and a constant price of risk. We can use the firm’s optimality condition to analytically characterize the covariance, which gives the bias term to be

$$
-\frac{(\theta-\theta_i)\,\mathrm{cov}(\Delta k_{it}, \Delta x_t)}{\mathrm{var}(\Delta x_t)} = \frac{1}{2}\,\frac{\theta-\theta_i}{1-\theta_i}\,\beta_i\rho_x(1-\rho_x).
$$

This term turns out to be negligible. Intuitively, because of time-to-build, investment in period $t+1$ capital is chosen in period $t$, before the innovation in productivity is realized. Because of this, the covariance between changes in capital and contemporaneous productivity is quite small and is only non-zero due to mean reversion in the AR(1) process (indeed, if productivity follows a random walk or is iid, i.e., $\rho_x = 1$ or $\rho_x = 0$, the bias term is zero).71 To verify this result quantitatively, we have simulated data under the extreme case where heterogeneity in $\theta_i$ is the only source of beta dispersion (we use the distribution of $\theta$ described in Appendix I). As described there, the true standard deviation of beta is 1.35; the biased estimate is 1.38. Although analytic expressions are not available in the full model with adjustment costs and time-varying risk, we have simulated this case as well – the biased estimate remains extremely close to the truth, 1.39. In sum, because the productivity betas are estimated off of covariances, they are extremely robust to concerns of both measurement of capital and unobserved parameter heterogeneity.

70 Moreover, a non-zero correlation between $\Delta e_{it}$ and $\Delta x_t$ is not itself sufficient to bias the estimates of beta dispersion. In the case of a non-zero correlation, the regression yields $\beta_i - \theta\frac{\mathrm{cov}(\Delta e_{it}, \Delta x_t)}{\mathrm{var}(\Delta x_t)}$. Thus, if the stochastic process on $e_{it}$ is common across firms, this will add a constant bias to the beta estimates, but will not affect our estimates of dispersion.

71 Industry-level heterogeneity is a special case where $\theta$ varies across industries but not across firms within an industry.

I The Sources of Betas

Heterogeneous technologies. With heterogeneity in input elasticities, the production function for firm $i$ is

$$
Y_{it} = X_t^{\hat\beta_i}\hat Z_{it}K_{it}^{\theta_{1i}}N_{it}^{\theta_{2i}} \tag{47}
$$

In this case, we must make a distinction between mpk and the average product of capital, $apk_{it} = y_{it} - k_{it}$, which is the object we measure in the data. With common parameters, these are proportional. With parameter heterogeneity, they are not. Following similar steps as in the baseline analysis, we can derive

$$
apk_{it+1} = -\log\theta_{1i} + \varepsilon_{it+1} + \beta_i\varepsilon_{t+1} + \beta_i\gamma_t\sigma_\varepsilon^2 + \text{constant} \tag{48}
$$

where

$$
\beta_i \equiv \frac{\hat\beta_i - \theta_{2i}\omega}{1-\theta_{2i}}
$$

In other words, an expression analogous to (14) holds, with two differences: first, variation in capital elasticities, $\theta_{1i}$, will directly lead to apk dispersion through the first term in (48). Second, the effective beta is now a combination of the direct sensitivity to the aggregate shock, $\hat\beta_i$, and the firm-specific labor elasticity, $\theta_{2i}$. Variation in $\theta_{2i}$ leads firms to have different exposures to changes in labor market conditions, captured through the cyclicality of wages, $\omega$.
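To make the effective beta concrete, the sketch below evaluates $\beta_i = (\hat\beta_i - \theta_{2i}\omega)/(1-\theta_{2i})$ for a few labor elasticities. The function name, the grid of $\theta_{2i}$ values and the values of $\omega$ are hypothetical and chosen only for illustration; they are not the calibration used in the paper.

```python
def effective_beta(beta_hat, theta2, omega):
    """Effective exposure (beta_hat - theta2 * omega) / (1 - theta2).

    beta_hat: direct sensitivity to the aggregate shock
    theta2:   firm-specific labor elasticity (must be below one)
    omega:    cyclicality of wages
    """
    if theta2 >= 1.0:
        raise ValueError("theta2 must be strictly below one")
    return (beta_hat - theta2 * omega) / (1.0 - theta2)


if __name__ == "__main__":
    labor_elasticities = [0.3, 0.5, 0.7]  # hypothetical grid
    for omega in (0.5, 1.0):              # hypothetical wage cyclicalities
        betas = [effective_beta(1.0, t2, omega) for t2 in labor_elasticities]
        print(f"omega={omega}: {betas}")
```

In this illustration, with a common $\hat\beta_i = 1$, the effective beta rises with the labor elasticity when $\omega$ is below one and is the same for every firm when $\omega$ equals one.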
To gain intuition, consider the extreme case where all heterogeneity in business cycle exposure comes through $\theta_{2i}$, i.e., $\hat\beta_i = 1\ \forall\, i$. Then, $\beta_i = \frac{1-\theta_{2i}\omega}{1-\theta_{2i}}$. It is straightforward to show that $\beta_i$ is increasing in $\theta_{2i}$ as long as $\omega < 1$, i.e., holding all else equal, labor intensive firms are more exposed to cyclical movements in wages, which in and of itself leads to a higher risk premium.72 Given this simple reinterpretation of beta, a version of the analysis in Section 3 continues to hold. In particular, we can derive an expression for expected stock market returns that is analogous to (21), but which now also reflects the variation in $\theta_{2i}$ – in other words, this type of heterogeneity should be picked up by our empirical measure of variation in risk premia.

72 Donangelo et al. (2018) explore a related mechanism and provide empirical support for the connection between “labor leverage” and risk premia. They also find that a necessary condition for this relationship to hold is that wages are less than perfectly procyclical, i.e., $\omega$ must be less than one.

How much of the observed beta dispersion can be attributed to variation in production function parameters? Although precisely pinning down its contribution is challenging, we can reach one (likely over-) estimate as follows. First, under the (admittedly strong) assumption that all cross-firm variation in labor’s share of income comes from heterogeneity in $\theta_{2i}$, we have $\frac{W_t N_{it}}{Y_{it}} = \theta_{2i}$. This is likely to be an upper bound, since there are many other reasons that labor’s share may differ across firms (e.g., labor market frictions or distortions). Donangelo et al. (2018) (Table XII, Panel C) report a cross-sectional standard deviation of labor’s share among Compustat firms of 0.186. Using this as an estimate of the dispersion in $\theta_{2i}$, we can calculate the implied beta dispersion.
Specifically, we assume that $\theta_{2i}$ is normally distributed and discretize the distribution on a seven-point grid following the method suggested in Kennan (2006). This yields a range of values for $\theta_{2i}$ from 0.31 to 0.84 with standard deviation 0.183. Next, we compute the implied betas as $\beta_i = \frac{1-\theta_{2i}\omega}{1-\theta_{2i}}$. The standard deviation of the betas is 1.35, which represents about 12% of the overall standard deviation of betas in Section 4.73

Heterogeneous markups. As is well known in the literature, the production function in expression (8) with decreasing returns to scale is isomorphic to a revenue function that arises with monopolistically competitive firms that produce differentiated products and face constant elasticity demand functions. Specifically, assume that demand and production for firm $i$ take the forms

$$
Q_{it} = P_{it}^{-\mu_i}, \qquad Y_{it} = X_t^{\tilde\beta_i}\tilde Z_{it}K_{it}^{\tilde\theta_1}N_{it}^{\tilde\theta_2}
$$

where $\mu_i$ denotes the (potentially firm-specific) elasticity of demand and $\tilde\theta_j$, $j = 1, 2$ the technological parameters in the production function, which for this section are assumed to be common across firms. It is straightforward to derive the following expression for firm revenues:

$$
P_{it}Y_{it} = X_t^{\hat\beta_i}\hat Z_{it}K_{it}^{\theta_{1i}}N_{it}^{\theta_{2i}}
$$

where $\hat\beta_i = \left(1-\frac{1}{\mu_i}\right)\tilde\beta_i$, $\hat Z_{it} = \tilde Z_{it}^{1-\frac{1}{\mu_i}}$ and $\theta_{ji} = \left(1-\frac{1}{\mu_i}\right)\tilde\theta_j$, $j = 1, 2$. With these reinterpretations of parameters, this is equivalent to (47) (there, the common price of output is equal to one). Note that for the case of a common demand elasticity, i.e., $\mu_i = \mu$, the analysis from Section 3 goes through exactly. With heterogeneity in demand elasticities, the analysis takes the same form as with technology heterogeneity – variation in technology and markups show up in the same way.
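The revenue function can be traced through in two short steps. The sketch below assumes that the firm sells all of its output, so that $Q_{it} = Y_{it}$; that market-clearing condition is an assumption made here for the illustration.

```latex
% Step 1: invert demand and impose Q_{it} = Y_{it}
P_{it} = Q_{it}^{-1/\mu_i} = Y_{it}^{-1/\mu_i}
\quad\Longrightarrow\quad
P_{it}Y_{it} = Y_{it}^{\,1-\frac{1}{\mu_i}}

% Step 2: substitute the production function and distribute the exponent
P_{it}Y_{it}
  = \left(X_t^{\tilde\beta_i}\,\tilde Z_{it}\,K_{it}^{\tilde\theta_1}\,N_{it}^{\tilde\theta_2}\right)^{1-\frac{1}{\mu_i}}
  = X_t^{\left(1-\frac{1}{\mu_i}\right)\tilde\beta_i}\;
    \tilde Z_{it}^{\,1-\frac{1}{\mu_i}}\;
    K_{it}^{\left(1-\frac{1}{\mu_i}\right)\tilde\theta_1}\;
    N_{it}^{\left(1-\frac{1}{\mu_i}\right)\tilde\theta_2}
```

Every exponent is scaled by the common factor $1 - 1/\mu_i$, the inverse of the markup $\mu_i/(\mu_i - 1)$, so a higher demand elasticity raises the revenue elasticities with respect to the aggregate shock and both inputs in the same proportion.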
Thus, markup dispersion across firms is an additional candidate for heterogeneous exposures and, indeed, should be picked up in our measures of firm-level risk premia. All else equal, firms facing a high demand elasticity (so setting a low markup, which is equal to $\frac{\mu_i}{\mu_i-1}$) respond more strongly to shocks and so show greater sensitivity to them. Even with no additional heterogeneity in $\tilde\beta_i$, the firm’s beta in the revenue function is given by $\hat\beta_i = 1-\frac{1}{\mu_i} = \frac{1}{markup_i}$, i.e., is the inverse of the markup. How much of the measured beta dispersion can variation in markups explain? Recent estimates of the within-industry standard deviation of (log) markups among Compustat firms yield values of about 0.20 (e.g., David and Venkateswaran (2019)).74 Following a similar approach as in our analysis of technology heterogeneity, we can compute the resulting dispersion in betas. Specifically, we discretize the distribution of log markups on a five-point grid. The lowest value on the grid implies a markup less than one, which we set to 1.01. We choose the standard deviation of the distribution so that the standard deviation of the truncated distribution is 0.20. This yields a range of markups from 1.01 to 1.63. After optimizing over labor, the implied beta for firm $i$ is given by $\beta_i = \frac{\hat\beta_i - \theta_{2i}\omega}{1-\theta_{2i}} = \frac{1-\tilde\theta_2\omega}{markup_i - \tilde\theta_2}$. We set $\tilde\theta_2$ to a standard value of 0.67 and compute the standard deviation of these betas, which is 0.71. This accounts for about 6% of the overall standard deviation calculated in Section 4.

73 David and Venkateswaran (2019) investigate technology heterogeneity in detail in a related framework and provide a sharper upper bound on the extent of this heterogeneity. We have also used their estimate for Compustat firms and found similar, though somewhat smaller, results.

Other parameter heterogeneity.
We have also examined the potential effects of two other forms of parameter heterogeneity – in the depreciation rate, $\delta$, and the properties of idiosyncratic shocks, i.e., their persistence and volatility, $\rho_z$ and $\sigma_{\tilde\varepsilon}^2$. To a first order, the latter two parameters do not enter our estimates of beta dispersion/risk premia anywhere – idiosyncratic shocks, while extremely important in determining firm dynamics, do not affect covariances and so do not lead to risk premia. Expression (21) shows that $\delta$ does play a role in determining expected stock returns (through the denominator of $\psi$, which, with heterogeneity in $\delta$ will be firm-specific), but a numerical simulation suggests these effects are small. For example, allowing $\delta$ to range from 0.04 to 0.16 (so half and double the baseline value) generates a spread in expected returns of 1.6%, which is modest relative to the extent of expected return dispersion in the data. For example, Table 5 in Appendix B.2 shows that the interquartile range of expected returns is almost 12%. Halving/doubling $\rho_z$ and $\sigma_{\tilde\varepsilon}^2$ also leads to only limited spreads in expected returns (1.3% and 2.3%, respectively). These results suggest that unobserved heterogeneity in these parameters seems unlikely to account for the substantial dispersion in risk premia observed in the data. Moreover, note that our calculation of productivity betas in Section 4.4 is independent of these parameters, further emphasizing that the majority of the empirical beta dispersion is unlikely to stem from these parameters.75

74 The statistics reported in Edmond et al. (2018) imply a roughly similar figure. Haltiwanger et al. (2018) find the same value using a different empirical method (namely, estimating a variable elasticity of substitution demand system using detailed data on prices and quantities) on a sample drawn from the Census of Manufactures.

Heterogeneous demand sensitivities.
SIC 5812 is defined as “Establishments primarily engaged in the retail sale of prepared food and drinks for on-premise or immediate consumption” and includes food service establishments ranging from fast food (e.g., McDonalds) to high-end restaurants (e.g., Ruth’s Chris Steak House). We gathered data (where available) on average check per person (usually proxied by total check divided by the number of entrees ordered) from publicly available sources, including company SEC filings and surveys performed by Citi Research and Morgan Stanley. The data are generally from 2014 to 2015. Matching these prices to the Compustat data yielded a sample of 20 publicly traded firms in SIC 5812 with data on prices, betas, expected returns and MPK.

We first extracted the set of all firms in SIC 5812 for which we have sufficient quarterly observations to compute our measures of risk exposure (20 consecutive quarters are required). Next we obtained data on average check per person. These data are primarily from surveys performed by Citi Research and Morgan Stanley, downloaded from https://finance.yahoo.com/news/much-costs-eat-every-major-201809513.html, dated September 2015. Of the firms in the Compustat sample, this gave us pricing data for 8 firms: McDonalds (MCD), Wendy’s (WEN), Sonic (SON), Chipotle (CHP), Cheesecake Factory (CHE), Texas Roadhouse (TEX), BJ’s Restaurants (BJR) and Red Robin (ROB). We supplemented these data with figures reported in company 10-K filings with the SEC for the year 2014 for Jack in the Box (JCK), Panera Bread (PAN), Carrol’s Restaurant Group (the largest Burger King franchisee; BUR), Chili’s (CHL), Cracker Barrel (CRA), Bob Evans (BOB), Ruth’s Chris Steak House (RUT), Denny’s (DEN), Famous Dave’s (FAM), Kona Grill (KON), Granite City (GRA) and Darden (DAR). Data on Granite City are from its 2013 10-K filing, where we calculated the average of the reported range across markets.
Darden owns Eddie V’s, Capital Grille, Seasons 52, Bahama Breeze, Olive Garden, Longhorn Steakhouse, Fleming’s, Bonefish Grill, Carrabba’s and Outback Steakhouse. It reports an average check for each of these chains separately, which we combined into a single value using a sales-weighted average. The largest among this group is Olive Garden. We excluded chains that were confined to a very limited geographic area and those for which we could not obtain average check data. In total, our sample consists of 20 firms. We computed average betas, expected returns and MPK for these firms over the period 2010-2015.

75 We have also explored the effects of adjustment costs alone by simulating a panel of firms with a common beta and computing the mean of period-by-period expected return dispersion. We find that adjustment costs on their own lead to very little dispersion in expected returns (average standard deviation of about 0.015 compared to 0.127 in the data), suggesting that it is unlikely that our estimates of beta are reflecting the effects of these costs. We have also verified that this result goes through for larger levels of these costs (e.g., $\xi = 3$, compared to 0.04 in the baseline). Note that this is in line with our approximate expression for expected returns in equation (21): that expression shows that, to a first order, expected returns are completely independent of adjustment costs.

Figure 2 illustrates the main results from this exercise. The top two panels of the figure plot average check against CAPM and demand betas, along with the lines of best fit. Both plots show a strong positive relationship – higher quality restaurants, as proxied by price, have higher exposure to aggregate shocks, measured using either stock market or operating data. Firms on the low end include McDonalds (MCD), Burger King (BUR), Wendy’s (WEN), Sonic (SON), etc., and towards the higher end Kona Grill (KON), Famous Dave’s (FAM) and Cheesecake Factory (CHE).
The highest-price restaurant in the sample is Ruth’s Chris Steak House (RUT).76 The bottom two panels of the figure go one step further and additionally link quality to measures of expected returns and MPK. Again, there is a strong positive relationship: higher quality restaurants – which the top panel shows tend to be those with higher exposure to aggregate shocks – have higher expected returns and MPK. Table 10 presents the full set of correlations across average check, betas, expected returns and MPK for this set of firms (we also add a measure of beta constructed from the Fama-French factors, which gives similar results). The table shows strong positive correlations between average check and the various beta measures, as well as between average check and returns and MPK. Further, the positive correlations between beta, expected returns and MPK show that high beta and high expected return firms tend to have high MPK. However, Figure 2 neatly summarizes the key message – differences in the responsiveness of firm-level demand to aggregate conditions due to quality variation and “trading down” seem a promising explanation of beta dispersion.

Table 10: Correlations – SIC 5812, Eating Places

                   Ln(avg. check)   CAPM Beta   Demand Beta   FF Beta   Expected Return   Ln(MPK)
Ln(avg. check)     1.00
CAPM Beta          0.47             1.00
Demand Beta        0.64             0.41        1.00
FF Beta            0.61             0.85        0.47          1.00
Expected Return    0.63             0.42        0.37          0.57      1.00
Ln(MPK)            0.65             0.65        0.63          0.77      0.64              1.00

76 Ruth’s Chris is somewhat of an outlier with a price of $76.00 per person, about three times larger than the next highest. We have verified that omitting Ruth’s Chris does not significantly change the results.

[Figure 2 consists of four scatter plots of firms, labeled by ticker, against Ln(avg. check). Top panels: CAPM Beta and Demand Beta. Bottom panels: Expected Stock Return and Ln(MPK).]

Figure 2: Average Check, Beta, Expected Returns and MPK in SIC 5812, Eating Places