Federal Reserve Bank of Chicago

Option-Implied Risk Aversion Estimates

Robert R. Bliss and Nikolaos Panigirtzoglou

Revised March 2003
WP 2001-15
Forthcoming in Journal of Finance

Robert R. Bliss
Research Department, Federal Reserve Bank of Chicago
230 South La Salle Street, Chicago, IL 60604-1413, U.S.A.
(312) 322-2313; (312) 322-2357 fax
Robert.Bliss@chi.frb.org

Nikolaos Panigirtzoglou
Monetary Instruments and Markets Division, Bank of England
Threadneedle Street, London EC2R 8AH, U.K.
+44-207-601-5440; +44-207-601-5953 fax
nikolaos.panigirtzoglou@bankofengland.co.uk

February 26, 2003
First draft: November 2, 2001
JEL classifications: G13, C12

We are particularly grateful for helpful discussions with Lars Hansen; for comments by Jeremy Berkowitz, Avi Bick, Peter Christopherson, George Kapetanios, Jesper Lindé, David Marshall, Marti Subrahmanyam, and seminar participants at the Bank of England, the Federal Reserve Bank of Chicago, Indiana University–Purdue University Indianapolis, McGill University, the Sveriges Riksbank, the University of Georgia, DePaul University, the 2002 Derivatives Securities Conference, the 2002 Bachelier Finance Society Congress, the 2002 European Financial Management Association Annual Meeting, the 2002 European Finance Association Annual Meeting, and the Warwick Business School, Financial Options Research Center 2002 conference on Options: Recent Advances; and for the guidance and suggestions of the editor, Richard Green, and the referee. We thank Darrin Halcomb for his excellent research assistance. Any remaining errors are our own. The views expressed herein are those of the authors and do not necessarily reflect those of the Federal Reserve Bank of Chicago or the Bank of England. This paper was previously titled "Recovering Risk Aversion from Options."

Option-Implied Risk Aversion Estimates

ROBERT R.
BLISS and NIKOLAOS PANIGIRTZOGLOU

Abstract

Using a utility function to adjust the risk-neutral PDF embedded in cross-sections of options, we obtain measures of the risk aversion implied in option prices. Using FTSE 100 and S&P 500 options, and both power and exponential utility functions, we estimate the representative agent's relative risk aversion at different horizons. The estimated coefficients of relative risk aversion are all reasonable. The relative risk aversion estimates are remarkably consistent across utility functions and across markets for given horizons. The degree of relative risk aversion declines broadly with the forecast horizon and is lower during periods of high market volatility.

Estimating the representative agent's or the market's degree of risk aversion from securities prices has a long history. However, only recently have scholars begun using options data to do so. Options provide a particularly promising context for studying risk preferences. Stocks are infinitely lived, so inferences must be drawn from the discounted stream of cash flows over an indefinite horizon. Usually this involves additional assumptions about how those cash flows evolve (e.g., constant growth of dividends). Since only one value, the discounted present value of all cash flows, is known, no inferences are possible about variations in preferences over different horizons. Options, on the other hand, have a fixed expiry date at which payoffs are realized.1 Furthermore, options contracts exist for different investment horizons. Options thus permit studying preferences over specific horizons and simultaneously over multiple horizons. Futures contracts also share this fixed-horizon characteristic. Options, however, provide a spectrum of observations for each expiry date on any given observation date—one for each quoted strike price—while futures provide only a single statistic for each expiry date/observation date pair.
The multiplicity of prices for different payoffs on the same underlying asset provided by options allows us to construct a density function for the distribution of possible values of the underlying asset. In contrast, single-datum stock and futures prices allow inference only about the mean of the distribution, unless additional assumptions are made linking the observed time-series to a stochastic process or a functional form for the density. This paper uses the informativeness of options, together with a new method of inferring the risk aversion implied by security market prices, to present unique evidence of the term structure of risk preferences. We confirm this across markets. We also present evidence that the implied risk preferences are volatility dependent.

1 American options present a somewhat more complicated investment horizon, but only to the extent that early exercise is optional. Still, even in that case American options allow a greater specificity of investment horizon than stocks do.

Cross-sections of option prices have long been used to estimate implied probability density functions (PDFs). These PDFs represent forward-looking forecasts of the distributions of prices of the underlying asset. Option-derived distributions have the distinct advantage of (usually) being based on data from a single point in time, rather than being taken from an historical time-series. As a result, these PDFs are theoretically much more responsive to changing market expectations than are density forecasts estimated from historical time-series data using statistical density estimation methods or derived from parameterized time-series models. Unfortunately, theory also tells us that the PDFs estimated from options prices are risk-neutral PDFs.
If the representative investor who determines options prices is not risk-neutral, these PDFs need not correspond to the representative investor's (that is, the market's) actual forecast of the future distribution of underlying asset values. If investors are rational, their subjective density forecasts should correspond, on average, to the distribution of realizations; that is, their subjective density forecasts should coincide with the objective or physical densities from which realizations are in fact drawn. Thus, one way to test whether risk-neutral densities reflect market expectations is to test whether they provide accurate density forecasts. If risk-neutral PDFs do not forecast accurately, we may infer that the difference between the risk-neutral and the accurate or objective forecast arises from the risk aversion of the representative agent. We can then use this difference to infer the degree of risk aversion of the representative investor.

A number of papers have examined the density forecast accuracy of different option-derived risk-neutral PDFs.2 Most of these studies have rejected the hypothesis that option-derived risk-neutral PDFs are accurate forecasts of the distribution of future values of the underlying asset. Thus, evidence suggests that implied PDFs cannot reliably be used to infer market expectations concerning the future distribution of the underlying asset. This is not entirely surprising, as there is a large literature establishing the existence of risk premia in market prices, particularly equity markets. Nonetheless, numerous other papers have proceeded to interpret risk-neutral PDFs as representative of market expectations.3

2 Anagnou, Bedendo, Hodges, and Tompkins (2002) provide an excellent review of previous papers before adding their own results.

3 See Bliss and Panigirtzoglou (2002) for a partial list.

The theoretical relation between state prices and state probabilities is well understood (see Huang and Litzenberger, 1988, eq. 6.2.1), as is the extension of this idea to continuous distributions and the relation between risk-neutral and objective density functions.4 Subject to certain conditions, such as complete and frictionless markets and a single asset, the risk-neutral density function, q(S_T), is related to the objective density function, p(S_T), by the representative investor's utility function, U(S_T), as follows:

    q(S_T) / p(S_T) = λ [U′(S_T) / U′(S_t)] ≡ ζ(S_T),    (1)

where λ is a constant. The function ζ(S_T) is the pricing kernel. Thus, knowing any two of the three functions—the risk-neutral density, the objective density, and/or the pricing kernel (equivalently the utility function)—permits us to infer the third.

4 The origins of this result have proved difficult to trace. The formulation used here is taken from Ait-Sahalia and Lo (2000).

The methodology in most previous studies of options and risk aversion has been to separately estimate the risk-neutral density from options prices and the objective (or statistical) density function from historical prices of the underlying asset, use these two separately derived functions to infer the pricing kernel, and then draw conclusions from the implied relative risk aversion function.5 These papers have typically imposed an assumption of stationarity on the statistical density function or the parameters of the underlying stochastic process to facilitate estimating the objective density function from historical data. The degree of stationarity assumed varies. Some studies, such as Ait-Sahalia and Lo (2000) and Jackwerth (2000), assumed that the distribution of returns is constant over a long period.
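As a concrete illustration of equation (1): under power utility, U(S) = S^(1−γ)/(1−γ), the marginal utility is U′(S) = S^(−γ), so the subjective density is proportional to q(S)/U′(S) = q(S)·S^γ after renormalization. The sketch below applies this adjustment on a numeric price grid; the function name, the lognormal risk-neutral density, and γ = 4 are purely illustrative choices of ours, not estimates from the paper.

```python
import numpy as np

def utility_adjust(grid, q, gamma):
    """Subjective density implied by equation (1) under power utility:
    p(S) is proportional to q(S) / U'(S), with U'(S) = S**(-gamma)."""
    p = q * grid ** gamma          # q(S) / U'(S)
    dx = grid[1] - grid[0]
    return p / (p.sum() * dx)      # renormalize so the density integrates to one

# Illustrative risk-neutral density: lognormal centered near 100 (hypothetical numbers)
grid = np.linspace(50.0, 200.0, 3000)
mu, sigma = np.log(100.0), 0.2
q = np.exp(-(np.log(grid) - mu) ** 2 / (2 * sigma ** 2)) / (grid * sigma * np.sqrt(2 * np.pi))
p = utility_adjust(grid, q, gamma=4.0)
# the adjustment shifts probability mass toward higher index levels, since the
# risk-neutral density overweights low-payoff (high marginal utility) states
```

The same normalization works for exponential utility by replacing U′(S) with e^(−aS).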
Other studies, such as Ait-Sahalia, Wang, and Yared (2001) and Rosenberg and Engle (2002), fit a stochastic process with parameters assumed to be constant over long periods; in other words, they assume that the conditional densities are time-invariant. Some also pool option cross-sections from different observation dates under the assumption that the risk-neutral PDF is also stationary.6 This ignores evidence to the contrary.7 These stationarity assumptions are not implied or required by the theory; they are made for practical econometric reasons. Unfortunately, the resulting risk-aversion functions are somewhat inconsistent with theory: either U-shaped or generally declining, but not monotonically so.

5 See, for example, Ait-Sahalia and Lo (2000), Ait-Sahalia, Wang, and Yared (2001), Coutant (2001), Jackwerth (2000), Weinberg (2001), Pérignon and Villa (2002), and Rosenberg and Engle (2002).

6 See, for example, Ait-Sahalia and Lo (2000), Ait-Sahalia, Wang, and Yared (2001), Coutant (2001), and Pérignon and Villa (2002).

7 Daily changes in implied PDFs are readily observable in the data and have been used in numerous event studies [for example, Campa, Chang, and Reider (1997), Melick and Thomas (1997), Campa, Chang, and Refalo (1998), Galati and Melick (1999), Gemmill and Saflekos (1999), Shiratsuka (1999), and Söderlind (1999)]; furthermore, several central banks track monthly changes in implied PDFs to infer changes in market sentiment. To be absolutely fair, this evidence is indirect and not necessarily conclusive. We know of no study that has directly tested whether it is possible to reject the stationarity of implied risk-neutral PDFs over various time intervals.

We can directly estimate time-varying risk-neutral PDFs without imposing strong a priori structures on the data by using single-observation cross-sections of options prices. However, one cannot independently estimate a time-varying statistical density from a time-series of prices without imposing an a priori structure, for instance by assuming that prices follow a particular stochastic process. Unfortunately, the statistical density and/or stochastic process stationarity assumptions made in doing so are subject to several criticisms. Estimated risk-neutral PDFs are rarely consistent with the simple functional forms implied by one-factor diffusion models, nor are changes in PDFs over time consistent with simple shifts in the mean of the stochastic process. Furthermore, when faced with changing risk-neutral PDFs, any assumed stationarity of the objective PDF becomes questionable. The assumption made in the previously discussed papers that the true statistical distribution is constant begs the question of why the risk-neutral distributions clearly are not. To explain the clearly time-varying risk-neutral distributions that we observe, a stationary statistical distribution requires either that the pricing kernel is time-varying or that investors are irrational, that is, that they do not account for the supposedly stationary distribution of prices. Directly testing the stationarity of the statistical distribution requires more data than is usually available, though volatility clustering, price and volatility spikes, and the frequent application of time-varying volatility models and regime-switching models to describe financial time-series point to the strong possibility that the true underlying statistical distributions are time-varying.8

An alternative to assuming statistical-distribution or stochastic-process stationarity is to assume risk-aversion function stationarity.
We can do this by assuming some well-behaved functional form for the underlying utility function, consistent with most non-options-based studies of market risk aversion.9 Thus, rather than imposing stationarity restrictions on the underlying statistical processes to permit estimating the objective density from a time-series of historical prices, we impose an alternative restriction on the pricing kernel and permit the objective density to vary over time. We assume a parametric form for the utility function, estimate the appropriate risk aversion parameter under the assumption that this value is stationary over the sample period, and then, using time-varying risk-neutral density functions estimated from option prices, derive the time-varying implied objective density functions. Our goal is to find implied subjective density functions that are consistent with both utility theory and rational expectations.

8 Rosenberg and Engle (2002) capture some of these effects through the use of a stochastic volatility model for the return-generating process.

9 Bartunek and Chowdhury (1997) combine this approach with a stationary model for the true return-generating process to generate a risk-neutral density function, which they then use to price options.

We investigate these questions using FTSE 100 and S&P 500 options and two different utility functions to adjust risk-neutral PDFs. We find, as others have, that risk-neutral PDFs are poor forecasters of the distribution of future values of the underlying indices. We then find the optimal values for the parameters of the utility functions used to construct the subjective PDFs and show that these subjective PDFs are improved forecasters of the distribution of future values of the underlying indices.
In most cases these utility-adjusted PDFs can no longer be rejected as "good" forecasts of the distributions of the underlying asset.10 The measures of risk aversion implicit in these adjustments are well behaved, of reasonable magnitude, and remarkably consistent across the two markets and the two utility functions considered. We also examine the relative performance of alternative sets of PDFs to determine whether one competing alternative is superior to another.

10 Of course, "failure to reject" does not mean we should "accept." Our use of the term "good forecast" is merely an expositional convenience and should be understood as such.

The remainder of the paper proceeds as follows: following a brief description of the data, the Methodology section outlines the theory underlying the comparison of risk-neutral and objective densities, details how we estimate risk-neutral PDFs, adjust them to get the subjective PDFs, and then test these subjective density forecasts to see whether they conform to the objective densities from which realizations are drawn. The empirical results are presented and analyzed in the Results section, and the Conclusion follows. The Appendix discusses alternative methods of testing density forecasts, together with the Monte Carlo tests we conducted to select the method used in this paper.

I. Data

Two sets of equity options contracts are used in this study—S&P 500 options traded on the Chicago Mercantile Exchange (CME) and FTSE 100 options traded on the London International Financial Futures Exchange (LIFFE)—together with data on the underlying asset and the risk-free interest rates needed to price options.11 Data included options expiries from February 18, 1983, through June 15, 2001, for the S&P 500 options and June 19, 1992, through March 16, 2001, for the FTSE 100 index options.

11 Short Sterling options were also examined but failed to produce enough usable cross-sections for meaningful analysis.
The CME S&P 500 options contract is an American option on the CME S&P 500 futures contract. S&P 500 options trade with expiries on the same expiry dates as the futures contracts, which trade out to one year with expiries in March, June, September, and December. In addition, there are monthly serial options contracts out to one quarter. Thus, at the beginning of January, options are trading with expiries in January, February, March, June, September, and December; at the beginning of February options trade with expiries in February, March, April, June, September, and December. Options expire on the third Friday of the expiry month, as do the futures contracts in their expiry months. Prior to March 1987, the S&P 500 futures settled to the value of the S&P 500 index at the close on Friday. Beginning in March 1987, the futures settled to an exchange-determined Special Opening Price on the expiry Friday. For serial months there is no futures expiry and the options settle to the closing price on the option expiry date of the next maturing S&P 500 futures contract. The S&P 500 realizations used in this study to compute options payoffs are the Special Opening Quote for quarterly contracts beginning in March 1987 and the S&P 500 futures settlement price for serial contracts and all contracts prior to 1987. Option quotations used to compute PDFs are the closing prices; the associated value of the underlying is the settlement price of the S&P 500 futures contract maturing on or just after the option expiry date. The LIFFE FTSE 100 option contract used in this study is a European option on the FTSE 100 equity index. Options are traded with expiries in March, June, September and December. Additional serial contracts are introduced so that options trade with expiries in each of the nearest three months. FTSE 100 options expire on the third Friday of the expiry month. 
FTSE 100 options positions are marked-to-market daily based on the daily settlement price, which is determined by LIFFE and confirmed by the Clearing House. The FTSE 100 options prices used in this study are the LIFFE-reported settlement prices. The quarterly FTSE 100 futures contracts expire on the same date as the options and therefore will have the same value as the index when the option expires. The European-style FTSE 100 contract may thus be viewed as an option on the futures, if one assumes that mark-to-market effects are insignificant. LIFFE reports the futures prices as the value of the underlying asset in their options data. For serial months, LIFFE constructs a theoretical futures price based on a fair-value spread over the current futures front quarterly delivery month. In computing FTSE 100 implied volatilities, the value of the underlying asset corresponding to each cross-section of option quotes used in this study is the actual or theoretical futures price reported by LIFFE for that contract. At expiry the options settle to the Exchange Delivery Settlement Price, determined by LIFFE by taking the average level of the FTSE 100 index sampled every 15 seconds between 10:10 and 10:30 on the last trading day, after first discarding the highest and lowest 12 observations. This series was used to compute FTSE 100 option payoffs for this study.

The risk-free rates used in this study are the British Bankers' Association's 11 a.m. fixings of the 3-month EuroDollar and Short Sterling London Inter-Bank Offered Rate (LIBOR) rates reported by Bloomberg. While this does not provide a maturity-matched interest rate, it can nonetheless be justified by necessity and lack of materiality. Overnight rates (Fed Funds, repos) are not representative of the borrowing costs faced by options traders and are subject to distortions arising from central bank activities.
Short-maturity Treasury bills or LIBOR rates are illiquid or are subject to price distortions due to their use by central banks for reserve management purposes.12 The 3-month LIBOR market has the dual advantages of liquidity and of approximating the actual market borrowing and lending rates faced by options market participants. Furthermore, Treasury rates represent lending, not borrowing, rates. In any case, the choice of interest rate has little effect on the methodology. Interest rates are used as an input when converting option prices to implied volatilities (for smoothing) and back again. A 100 basis point (bp) change in the assumed interest rate will produce approximately a 2 basis point change in the measured at-the-money implied volatility for a 1-month contract, increasing to 5 bp at the 6-month horizon. Use of the 3-month interest rate as a proxy for the 1-month rate is unlikely to misstate the correct (if unobservable) rate by anything approaching 100 bp.

12 Duffee (1996) provides evidence that short-maturity U.S. Treasury securities exhibit idiosyncratic variations that make them unsuitable proxies for the U.S. risk-free rate. The U.K. does not have a liquid Treasury bill market.

A target observation date was determined for horizons of 1, 2, 3, 4, 5, and 6 weeks; 1, 2, 3, 4, 5, 6, and 9 months; and one year by subtracting the appropriate number of days (weekly horizons) or months (monthly and 1-year horizons) from the option expiry date. If no options traded on the target observation date, the nearest options trading date was determined. If this nearest trading date differed from the target observation date by no more than 3 days for weekly horizons or 4 days for monthly and 1-year horizons, that date was substituted for the original target date. If no sufficiently close options trading date existed, that expiry was excluded from the sample for that horizon. Option quotes for the target dates were then filtered.
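The target-date selection rule just described can be sketched as follows for a weekly horizon; the function name, the trading-calendar input, and the example dates are our own illustrative assumptions (the paper does not publish code):

```python
from datetime import date, timedelta
import bisect

def target_observation_date(expiry, horizon_weeks, trading_dates, tol_days=3):
    """Subtract the horizon from the expiry date; if that exact date did not
    trade, substitute the nearest trading date within the tolerance (3 days
    for weekly horizons), otherwise drop the expiry for this horizon.
    `trading_dates` must be a sorted list of dates with option quotes."""
    target = expiry - timedelta(weeks=horizon_weeks)
    i = bisect.bisect_left(trading_dates, target)
    candidates = trading_dates[max(i - 1, 0):i + 1]   # neighbors around the target
    if not candidates:
        return None
    nearest = min(candidates, key=lambda d: abs((d - target).days))
    return nearest if abs((nearest - target).days) <= tol_days else None

# Hypothetical calendar: the exact target (June 8, 2001) did not trade
obs = target_observation_date(date(2001, 6, 15), 1,
                              [date(2001, 6, 4), date(2001, 6, 7)])
```

For monthly and 1-year horizons the same logic applies with month subtraction and a 4-day tolerance.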
Because trading in options markets is asymmetrically concentrated in at- and out-of-the-money strikes, and because the spline algorithm will not accommodate duplicate strikes in the data, we discard in-the-money options. Options for which it was impossible to compute an implied volatility (usually far-away-from-the-money options quoted at their intrinsic value) and options with implied volatilities of greater than 100 percent were also discarded. If fewer than five usable strikes remained in a given cross-section, the entire cross-section was discarded. Table I presents the resulting cross-section counts and the range and mean of the strikes per cross-section of the remaining data. In practice, too few cross-sections lead to insufficient power to conduct meaningful tests. Horizons greater than 2 months were found to have too few usable cross-sections for our study.

Another problem we encounter is overlapping data. Serial options, those with expiries of less than three months, expire at monthly intervals. Forecasts and realizations for horizons of less than or equal to one month can therefore reasonably be expected to be independent from one observation/realization interval to the next, since the intervals share no common innovations in the price path of the underlying asset. However, for forecasts beyond a one-month horizon, the time paths from forecast date to realization for consecutive forecasts begin to overlap and thus contain some common innovations in the price path of the underlying asset. Since our tests (see below) are predicated on the null hypothesis that the data are independent, serial dependence arising from overlapping observations undermines the informativeness of the tests. The actual degree of the problem is an empirical question, which we test for. Overlapping data produced serious autocorrelation problems for maturities longer than five weeks.
Our final sample therefore consists of filtered cross-sections for weekly horizons of between one and six weeks, with the 6-week horizon included to show the effects of overlapping data.13

13 Serial dependence arising from overlapping data could be addressed by increasing the sampling interval to, for instance, quarterly expiries. This, however, reduces the sample size to the point that tests lack sufficient power to draw meaningful conclusions. In the presence of the 4- and 5-week horizons, the 1-month horizon is redundant. Results for the 1-month horizon are consistent with the patterns found in the weekly horizon data.

II. Methodology

Our approach to studying the risk premium implicit in options prices involves looking at the ability of risk-neutral and risk-adjusted or subjective PDFs to forecast future realizations of the underlying asset. Our assumption is that investors are rational and perhaps risk-averse. If we were interested only in point forecasts, this would mean that the degree of bias in the price forecast could be interpreted as an indication of the degree of market risk aversion, provided the bias is of the correct sign, rather than an indication that investors are irrational. In this study, we are interested in forecasts of distributions rather than of single point estimates. We will therefore examine whether the realizations over time are consistent with the PDFs implicit in options prices observed at some horizon prior to the respective realizations.

Option prices embed risk-neutral PDFs. If these risk-neutral PDFs provide good forecasts of the distribution of future realizations, then we must conclude that there is no evidence of risk premia in the pricing of options. On the other hand, if risk-neutral PDFs are not good forecasters, we can test whether risk-adjusted PDFs provide better forecasts. If this is the case, the relative risk aversion of the utility function used to adjust the risk-neutral PDF provides a measure of the degree of risk aversion of the representative investor. To execute our study we need to be able to:

1. compute risk-neutral PDFs from option prices,
2. test the forecast ability of PDFs, both risk-neutral and subjective, and
3. adjust risk-neutral PDFs to derive subjective PDFs.

A. Estimating the Risk-Neutral Probability Density Function

Breeden and Litzenberger (1978) showed that the PDF for the value of the underlying asset at option expiry, f(S_T), is related to the European call price function by

    f(S_T) = e^{r(T−t)} ∂²C(S_t, K, T, t)/∂K² |_{K=S_T},

where S_t is the current value of the underlying asset, K is the option strike price, and T−t is the time to expiry. Unfortunately, available option quotes do not provide a continuous call price function. To construct such a function we must fit a smoothing function to the available data. In this paper, we employ a refinement of the smoothed implied volatility smile method developed by Panigirtzoglou and presented in Bliss and Panigirtzoglou (2002).14 The essence of the Panigirtzoglou and related methods is to smooth implied volatilities rather than option prices and then convert the smoothed implied volatility function into a smoothed price function, which can be numerically differentiated to produce the estimated PDF.

14 Numerous methods have been developed for extracting PDFs from option prices. Bliss and Panigirtzoglou (2002) provide a review of many of these. The Panigirtzoglou method itself derives from previous work, as discussed in Bliss and Panigirtzoglou. The Panigirtzoglou method was selected for this paper because Bliss and Panigirtzoglou found it to be relatively robust and because the method permits calibrating the desired smoothness of the extracted PDF.

The Black-Scholes formula is used to extract implied volatilities for European options (FTSE 100) and the Barone-Adesi-Whaley (1987) formula is used for American options (S&P 500). At the same time, strike prices are converted into deltas using the Black-Scholes delta and the appropriate at-the-money implied volatility, thus producing a series of transformed raw data points in implied volatility/delta space.15 It is important to note that the use of the Black-Scholes and Barone-Adesi-Whaley formulae is solely to convert data from one space (price/strike) to another (implied volatility/delta) where smoothing can be done more efficaciously. Doing so does not presume that either formula correctly prices options.

15 Smoothing the implied volatilities in delta space rather than strike space was introduced by Malz (1997). Moreover, using delta rather than strike groups the away-from-the-money implied volatilities more closely together than near-the-money ones. This permits a greater variation in shape near the center of the distribution, where the data are more reliable. Malz shows that this achieves better results than smoothing in strike space.

A weighted natural spline is used to fit a smoothing function to the transformed raw data. The natural spline minimizes the following function:

    min_θ Σ_{i=1}^{N} w_i (IV_i − IV(Δ_i, θ))² + λ ∫_{−∞}^{+∞} g″(x; θ)² dx,

where IV_i is the implied volatility of the ith option in the cross-section; IV(Δ_i, θ) is the fitted implied volatility, which is a function of the ith option's delta, Δ_i, and the parameters, θ, that define the smoothing spline, g(x; θ); and w_i is the weight applied to the ith option's squared fitted implied volatility error. In this paper we use the option vegas, ν ≡ ∂C/∂σ, to weight the observations.16 The parameter λ is a smoothing parameter that controls the tradeoff between the goodness-of-fit of the fitted spline and its smoothness, measured by the integrated squared second derivative of the implied volatility function. In our preliminary tests we used values of λ ranging from 0.99 to 0.9999 to check the sensitivity of our results to the degree of smoothness we impose on the estimated PDF.
These tests indicated that forecast results were insensitive to the choice of λ. We therefore report results based on λ = 0.99.

When fitting a PDF it is necessary to extrapolate the spline beyond the range of available data.17 Since we rarely observe extreme realizations of the underlying asset (outcomes beyond the range of available strikes at horizons of less than two months), we have little information as to the appropriate shape to impose on the tails of the density function. Fortunately, the scarcity of tail outcomes also means that the results are not critically dependent on the choice so long as it is economically reasonable.

16 Vega weighting is consistent with homoscedastic pricing errors, such as those resulting from discrete tick size. Furthermore, vega weighting places less weight on away-from-the-money strikes, which is also consistent with the observed lower liquidity of away-from-the-money options.

17 Anagnou, Bedendo, Hodges, and Tompkins (2002) use PDFs truncated to the range of available strikes and then rescaled. This unusual procedure avoids extrapolating the tails of the PDF, but cannot handle realizations falling outside the range of strikes available when the PDF was constructed.

The natural spline is linear outside the range of available data points and can thus result in negative or implausibly large positive fitted implied volatilities. To prevent this from happening, we force the spline to extrapolate smoothly in a horizontal manner. We do this by introducing two pseudo-data points spaced three strike intervals18 above and below the range of strikes in the cross-sections, with implied volatilities equal to those of the respective extreme-strike options. These pseudo-data points are added to the cross-sections before the above transformations and spline-fitting take place.
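The pseudo-data-point construction can be sketched as follows; the function name and the assumption of a uniform strike grid are ours (actual strike grids need not be exactly uniform):

```python
import numpy as np

def add_tail_anchors(strikes, ivs, n_intervals=3):
    """Append pseudo-data points three strike intervals beyond each end of the
    quoted range, carrying the extreme implied volatilities, so that a natural
    spline fitted to the augmented data extrapolates flat rather than linearly."""
    step = strikes[1] - strikes[0]              # assume roughly uniform spacing
    lo = strikes[0] - n_intervals * step
    hi = strikes[-1] + n_intervals * step
    new_strikes = np.concatenate(([lo], strikes, [hi]))
    new_ivs = np.concatenate(([ivs[0]], ivs, [ivs[-1]]))
    return new_strikes, new_ivs

# Hypothetical three-strike cross-section with 5-point strike intervals
ks, vs = add_tail_anchors(np.array([100.0, 105.0, 110.0]),
                          np.array([0.20, 0.18, 0.19]))
```

Because the anchor points repeat the end-point volatilities, the fitted smile flattens out at the edges, which (after conversion back to prices) pastes well-behaved tails onto the density.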
Extrapolating the implied volatility function in this manner has the effect of smoothly pasting log-normal tails onto the implied density function beyond the range of strikes.

Once the spline, g(x; θ), is fitted, 5,000 points along the function are converted back to price/strike space using the Black-Scholes formula. The delta-to-strike conversion uses the same at-the-money implied volatility used for the earlier strike-to-delta conversion, thus preserving the consistency of the initial data transformation and its inverse. The implied volatility-to-call price conversion uses the implied volatility provided by the fitted implied volatility function to produce a fitted European call price function. The 5,000 points are selected to produce equally spaced strikes over the range where the PDF is different from zero. This range varies with each cross-section, primarily as the price level of the underlying changes. Finally, we use the 5,000 call price/strike data points to numerically differentiate the call price function to obtain the estimated PDF for the cross-section.

B. Testing PDF Forecast Ability

Each option cross-section produces an estimated PDF, f̂_t(·), for a single option expiry date. Our goal is to test the hypothesis that the estimated PDFs, f̂_t(·), are equal to the true PDFs, f_t(·). The time-series of PDFs generated for a given forecast horizon are all different. Only one realization, X_t, is observed for each option observation/expiry date pair.

[18] "Strike intervals" refers to the interval between adjacent quoted option strikes.

Under the null hypothesis that the X_t are independent and that the estimated PDFs are the true PDFs, that is, f̂_t(·) = f_t(·), the inverse probability transformations of the realizations,

y_t = ∫_{−∞}^{X_t} f̂_t(u) du,

will be independently and uniformly distributed: y_t ~ i.i.d. U(0,1).[19]
The range of the transformed data is guaranteed by the inverse probability transformation itself, but the uniformity will obtain only if the estimated PDF equals the true PDF. Independence should also be established, as most distributional tests assume independence and would generate incorrect inferences if this were not the case, though independence is not always verified in practice.

Several non-parametric methods have been proposed for testing the uniformity of the inverse probability transformed data, including the Kolmogorov-Smirnov, Chi-squared, and Kuiper tests. None of these methods provides a joint test of the assumption that the y_t are i.i.d. Berkowitz (2001) has proposed a parametric methodology for jointly testing uniformity and independence. He first defines a further transformation, z_t, of the inverse probability transformation, y_t, using the inverse of the standard normal cumulative density function, Φ(·):

z_t = Φ⁻¹(y_t) = Φ⁻¹( ∫_{−∞}^{X_t} f̂_t(u) du );  (2)

under the null hypothesis, f̂_t(·) = f_t(·), z_t ~ i.i.d. N(0,1). Berkowitz tests the independence and standard normality of the z_t by estimating the following model

[19] Kendall and Stuart (1979), section 30.36, discusses the case where the X_i are i.i.d. and the estimated densities do not depend on the X_i. Where the estimated densities do depend on the X_i, problems may ensue and the inverse probability transform need not be independent or uniform. Diebold, Gunther, and Tay (1998) show that for a special case (arising in GARCH processes) where the true densities depend only on past values of X_i (and no other conditioning information) the i.i.d. uniform result holds. However, in the problem addressed in this paper, the PDFs are estimated from option prices and values of the underlying, which do not include the X_i. We therefore rely on Kendall and Stuart.
z_t − μ = ρ(z_{t−1} − μ) + ε_t,  (3)

using maximum likelihood and then testing restrictions on the estimated parameters using a likelihood ratio test.[20] Under the null, the parameters of this model should be: μ = 0, ρ = 0, and Var(ε_t) = 1. Denoting the log-likelihood function as L(μ, σ², ρ), the likelihood ratio statistic, LR3 = −2[L(0, 1, 0) − L(μ̂, σ̂², ρ̂)], is distributed χ²(3) under the null hypothesis.

In practice, it is sometimes necessary to test overlapping forecasts, for example 2-month-ahead forecasts of monthly realizations. In this case, if the above test rejects, it is possible that the rejection arises from the overlapping nature of the data, which may induce autocorrelation, rather than from problems with the estimated PDFs. This is also true for non-overlapping, but serially correlated, data. Berkowitz therefore tests the independence assumption separately by examining LR1 = −2[L(μ̂, σ̂², 0) − L(μ̂, σ̂², ρ̂)], which has a χ²(1) distribution under the null. If LR3 rejects the hypothesis that the z_t ~ i.i.d. N(0,1), failure to reject LR1 provides evidence that the estimated PDFs are not providing accurate forecasts of the true time-varying densities. On the other hand, if both LR3 and LR1 reject, we cannot determine whether the problem arises from a lack of forecast ability or from serial correlation in the data. Failure to reject both LR3 and LR1 is consistent with forecast power, though as in all statistical tests, failure to reject the null hypothesis does not necessarily mean that the null hypothesis is true.

The simple AR(1) model used in the above Berkowitz test captures only a specific sort of serial dependence in the data, though this is the dependence most likely to occur in this case. Berkowitz (2001) shows how to expand the model and associated tests to higher-order AR(p) processes. However, this results in increasing numbers of model parameters and reduced power.
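The Berkowitz test can be sketched as follows under stated simplifying assumptions: the AR(1) model is estimated by conditional (least-squares) maximum likelihood, a common simplification of the exact Gaussian likelihood referenced in the text, and the z_t are generated from uniform pseudo-data satisfying the null rather than from estimated PDFs.

```python
import numpy as np
from scipy.stats import chi2, norm

def ar1_loglik(z, mu, sig2, rho):
    # Conditional Gaussian log-likelihood for z_t - mu = rho*(z_{t-1} - mu) + eps_t
    resid = (z[1:] - mu) - rho * (z[:-1] - mu)
    return np.sum(norm.logpdf(resid, scale=np.sqrt(sig2)))

def berkowitz_test(z):
    # Unrestricted conditional MLE via least squares of z_t on z_{t-1}
    phi, c = np.polyfit(z[:-1], z[1:], 1)
    mu = c / (1.0 - phi)
    sig2 = np.mean((z[1:] - c - phi * z[:-1]) ** 2)
    ll_full = ar1_loglik(z, mu, sig2, phi)
    # LR3: joint test of mu = 0, sigma^2 = 1, rho = 0
    lr3 = -2.0 * (ar1_loglik(z, 0.0, 1.0, 0.0) - ll_full)
    # LR1: test of rho = 0 alone (mean and variance re-estimated under rho = 0)
    lr1 = -2.0 * (ar1_loglik(z, z.mean(), z.var(), 0.0) - ll_full)
    return lr3, chi2.sf(lr3, 3), lr1, chi2.sf(lr1, 1)

# Pseudo-data satisfying the null: y_t ~ U(0,1), so z_t = Phi^{-1}(y_t) ~ N(0,1)
rng = np.random.default_rng(1)
y = rng.uniform(0.0, 1.0, 300)   # stand-in for the inverse probability transforms
z = norm.ppf(y)                  # Berkowitz transformation, equation (2)
lr3, p3, lr1, p1 = berkowitz_test(z)
```

With data drawn under the null, LR3 should be small relative to the χ²(3) distribution; applying the same function to autocorrelated or mis-calibrated z_t produces large LR3 and/or LR1 values.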
[20] The log-likelihood function for this model is given in Hamilton (1994), equation (5.2.9). This test does not test the normality of the transformed data per se, but rather that the data are standard normal under the assumption that they are normally distributed.

The LR test is uniformly most powerful only in a single-sided hypothesis test. However, as we show in Appendix A, in Monte Carlo simulations the Berkowitz test is more reliable than the Chi-squared and Kuiper tests in large and small samples under the null hypothesis, and is additionally superior to the Kolmogorov-Smirnov test in small samples when the data are autocorrelated, because it is a joint test of uniformity and independence. We therefore use the Berkowitz test in this paper.

C. Estimating the Subjective Density Function

To compute and test the forecast ability of a subjective density function it is first necessary to hypothesize a utility function for the representative agent and then, following Ait-Sahalia and Lo (2000), use this to convert the estimated risk-neutral density function into a subjective density function. The forecast ability of the resulting subjective density function is then tested in the same manner as the risk-neutral density function. Given an estimated risk-neutral density function and a utility function, equation (1) can be transformed to solve for the implied subjective density function. The resulting subjective density function must be normalized to integrate to one. Thus,

p(S_T) = [q(S_T)/ζ(S_T; S_t)] / ∫ [q(x)/ζ(x; S_t)] dx = [q(S_T) U′(S_t)/(λ U′(S_T))] / ∫ [q(x) U′(S_t)/(λ U′(x))] dx = [q(S_T)/U′(S_T)] / ∫ [q(x)/U′(x)] dx.

In the normalization process the parameter λ disappears; however, any parameters of the utility function itself must be estimated. In this paper, we test subjective density functions derived using two representative-agent utility functions: the power utility function and the exponential utility function.
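A minimal sketch of the risk-neutral-to-subjective conversion for the power-utility case, where U′(S) = S^(−γ) so that p(S) is proportional to q(S)·S^γ; the grid, the log-normal risk-neutral density, and the value of γ below are illustrative assumptions.

```python
import numpy as np

def subjective_pdf(grid, q, gamma):
    # Divide the risk-neutral density by marginal utility and renormalize.
    # Power utility: U'(S) = S**(-gamma), so p(S) is proportional to q(S) * S**gamma.
    p = q * grid ** gamma
    return p / (np.sum(p) * (grid[1] - grid[0]))  # normalize to integrate to one

# Hypothetical log-normal risk-neutral density on an equally spaced price grid
grid = np.linspace(50.0, 200.0, 2000)
mu, sig = np.log(100.0), 0.15
q = np.exp(-((np.log(grid) - mu) ** 2) / (2 * sig**2)) / (grid * sig * np.sqrt(2 * np.pi))

dx = grid[1] - grid[0]
p = subjective_pdf(grid, q, gamma=4.0)
# Risk aversion tilts probability mass toward higher index levels,
# so the subjective mean exceeds the risk-neutral mean
mean_shift = np.sum(grid * p) * dx - np.sum(grid * q) * dx
```

As the text notes, the pricing-kernel constant λ cancels in the normalization, which is why only the utility-function parameter γ needs to be estimated.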
In both cases the utility functions, and thus the resulting subjective density functions, are conditioned on the value of a single parameter, γ. In testing the subjective density functions we first selected the value of γ that maximizes the forecast ability of the resulting subjective PDFs by maximizing over γ the p-value of the Berkowitz LR3 statistic.

Table II provides the functional forms of the power and exponential utility functions and the marginal utility function used to transform the risk-neutral density into the corresponding subjective density, together with the measure of relative risk aversion (RRA) for each utility function. The power utility function has constant relative risk aversion, and the measure of RRA is simply equal to the parameter γ. However, the exponential utility function exhibits constant absolute risk aversion, the parameter γ, rather than constant relative risk aversion. For exponential utility, the RRA depends on both γ and the realization S_T, which is time varying. Therefore, for exponential utility we report the distribution of the RRA across the sample observations.

D. Significance of the RRA Estimates

The Berkowitz likelihood ratio statistic has a χ²(3) distribution for a fixed γ. However, the process of searching for the optimal level of γ alters the distribution of the test statistic, biasing the likelihood ratio towards unity and thus overstating the p-value. (This does not affect the tests of the risk-neutral densities.) Furthermore, the process of maximizing the Berkowitz statistic over values of γ does not provide a measure of whether the resulting γ is significantly different from zero, only of whether the resulting adjusted PDFs can be rejected as correct forecasts of the distribution of future outcomes. Therefore, we need to find a way of correcting the Berkowitz p-values and determining the significance of the estimated values of γ.
Unfortunately, the γ-estimation methodology is complex and does not lend itself to simple analysis. Certainly analytic expressions for standard errors are impossible, as the likelihood depends on empirical rather than analytical PDFs. In such situations, resampling and Monte Carlo methods provide straightforward approaches to investigating the properties of the estimation procedure and the resulting estimates. We use Monte Carlo to investigate the properties of our methodology under the null hypothesis that the representative investor is risk-neutral. To determine whether the properties of the estimator vary with the level of γ, we also conduct the same tests under assumed levels of risk aversion comparable to those we estimate using actual data. Bootstrap and cross-validation yield evidence of the sampling variation of the estimated RRAs using actual, rather than simulated, data, and thus provide a cross-check of the Monte Carlo results. Cross-validation also provides a systematic test for (single) influential observations.

D.1. Monte Carlo Tests

The Monte Carlo tests use the same risk-neutral PDFs employed in our other tests. However, actual outcomes are replaced with pseudo-outcomes repeatedly drawn from the risk-neutral PDFs. This process produced a set of outcomes where the true value of γ was known (that is, zero).[21] One thousand replications were employed for each contract type/expiry-horizon combination. For each simulated set of outcomes, the Berkowitz p-value-maximizing value of γ was estimated as described in the previous section. The resulting distributions of p-values and γs give us an idea of the standard error of the estimates which our methodology produces under known conditions. Figure 1 presents the distribution of the p-values and γs from simulations based on risk-neutral PDFs by horizon, utility-adjustment method, and contract type.
The box portion of each Tukey plot encompasses the inter-quartile range of the estimates (p-values or γs) derived from the Monte Carlo simulations. The lines dividing the box are the mean (dotted) and median (solid), respectively. The "whiskers" are the 10th and 90th percentiles of the distributions, and the end points correspond to the 5th and 95th percentiles.

The top four panels of Figure 1 present the distribution of the p-values produced by the simulations under the assumption that the Berkowitz LR3 statistic is distributed χ²(3). If the distribution of the Berkowitz LR3 produced by the search process which maximized the LR3 statistic over values of γ were indeed χ²(3), the various percentiles of the distributions of the simulated p-values would appear at the corresponding percentiles indicated on the vertical axes: thus, the mean and median should coincide at 0.5, the box should extend from 0.25 to 0.75, and so forth. This is clearly not the case; the distributions of Monte Carlo generated p-values show significant upward bias, and the distribution is no longer uniform or even symmetric. Therefore, we cannot evaluate the Berkowitz LR3 statistic derived from the utility-adjusted PDFs under the assumption that it is distributed χ²(3).

[21] In preliminary Monte Carlo experiments we found that the distributions of p-values and γ estimates are virtually identical when outcomes are drawn from risk-neutral PDFs and from subjective (utility-function-adjusted) PDFs constructed using fixed values of γ, excepting the expected mean shift in the estimated values of γ. This finding was confirmed using rank sign tests for differences in paired distributions. We therefore base our analysis on Monte Carlo simulations using risk-neutral PDFs.
However, we can use the Monte Carlo distribution of p-values to adjust the χ²(3) p-values estimated using actual data—we use the frequency with which the Monte Carlo p-values fall short of our actual-data p-values.[22] Using the 1-week horizon power-utility-adjusted simulations as an example, a p-value of less than 0.5 occurs 30.5 percent of the time. We therefore adjust the p-value for the Berkowitz likelihood ratio statistic downwards from the 0.5 it has under the incorrect χ²(3) distribution to the 0.305 we find in the Monte Carlo simulations under the null hypothesis that the outcome data are drawn from the risk-neutral PDFs.[23]

[22] Because the mapping from the LR3 statistic to the corresponding χ²(3) p-value is monotonic, we can work with the Monte Carlo distribution of either statistic to derive the adjusted p-value for our empirical tests.

[23] Because there is a monotonic mapping from the Berkowitz likelihood ratio statistic to the χ²(3) p-value, adjusting the p-value using the distribution of simulated p-values is equivalent to computing the adjusted p-value of the Berkowitz statistic directly from the distribution of the simulated likelihood ratios.

The distributions of simulated γ estimates, by horizon, utility-adjustment method, and contract type, produced by the same simulations are shown in the bottom four panels of Figure 1. The distributions show a slight, but not significant, positive bias (recall that the true value of γ is zero). The biases are also small compared to the values of γ estimated from actual outcome data (vide infra). We can use these distributions of γ-values, obtained under the assumption that γ = 0, to provide a rough proxy for the statistical significance of the actual-data γ estimates against the null hypothesis that the true value of γ is zero. That is, we can test whether the differences between the risk-neutral and utility-adjusted PDFs are significant, even when tests of forecast ability based on the p-values of the LR3 statistic fail to reject the hypothesis that the risk-neutral PDFs are "good" forecasts of the distribution of future outcomes of the underlying index.[24]

D.2. Bootstrap Tests

Monte Carlo provides us with information about the distribution of test results when the data are drawn from a known model—that is, when the null hypothesis is indeed true. Complex estimation methodologies involving non-linear optimization techniques which appear to be well behaved under the assumed data structure can sometimes become ill behaved under actual data, which invariably differ from the null hypothesis to some degree. It is therefore important to confirm the Monte Carlo results with actual data, to show that the sampling variation of the estimator is not materially different from the sampling variation derived from the Monte Carlo simulations (particularly if the Monte Carlo results are to be used to infer significance in later tests). Bootstrapping captures the impact of the actual data and potential model misspecification on the reliability of parameter estimates.

We applied the bootstrap using the two representative contracts—5-week S&P 500 and 3-week FTSE 100—again with 1,000 replications in each case. Each replication consisted of drawing with replacement a random sample of pairs of densities and associated outcomes from the original sample. Each bootstrap sample was the same size as the original samples (183 and 108, respectively). Each bootstrap sample was then used to estimate power- and exponential-utility γs and p-values. Since bootstrapping destroys the independence assumption underlying the computation of p-values, the distribution of bootstrap p-values is uninformative. However, the distribution of γs, or equivalently RRAs, provides an indication of the sampling variation of these estimates when the full sample is used. We report the distribution of RRAs in Table III, rather than γs, to facilitate comparison across utility functions.
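The bootstrap loop can be sketched as below. The estimator here is a trivial stand-in for the actual p-value-maximizing γ estimation, and the per-observation values are simulated; only the sample size (183) is taken from the text.

```python
import numpy as np

rng = np.random.default_rng(3)

def estimate_rra(data):
    # Stand-in for the p-value-maximizing gamma estimation applied to a
    # resampled set of density/outcome pairs; here simply a sample mean
    return data.mean()

# Hypothetical per-observation RRA proxies; 183 matches the S&P 500 sample size
sample = rng.normal(4.0, 2.0, 183)

# 1,000 bootstrap replications, each resampling pairs with replacement
boot = np.array([
    estimate_rra(rng.choice(sample, size=sample.size, replace=True))
    for _ in range(1000)
])
se = boot.std()                      # bootstrap standard error of the RRA estimate
frac_negative = np.mean(boot < 0.0)  # incidence of resampled RRAs below zero
```

The bootstrap standard error and the fraction of negative resampled estimates correspond to the quantities the text compares against the Monte Carlo γ distributions.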
RRAs and γs are identical in the case of the power utility function, and differ by a fixed scalar (the average level of the index at expiry) in the case of exponential utility. The standard deviations of the bootstrap γ-estimates are comparable to the standard deviation of the Monte Carlo γ-estimates (last row of Table III), providing additional support for using the Monte Carlo γ distributions to estimate the significance levels of the actual-data γ-values. If used to compute t-statistics for the observed full-sample estimated values (top row of Table III), these standard deviations suggest significance levels exceeding conventional levels against the null hypothesis that the true RRAs are zero. This is confirmed by the low incidence of resampled datasets that produce estimated RRAs of less than zero.

[24] Efron and Tibshirani (1993) argue that precise estimation of confidence intervals (at say the 95 percent level) requires several thousand replications, since the relevant tail outcomes are by definition infrequent. This is impractical in this instance. Therefore, while a few extreme outcomes in 1,000 replications is suggestive of an extremely low probability, we do not wish to assert that the exact confidence level has been determined.

D.3. Cross-Validation Tests

Cross-validation is useful in checking for sampling error and individual influential observations. Figure 2 presents the results of cross-validations of the 4-week S&P 500 contract and the 3-week FTSE 100 contract. The FTSE 100 shows much greater sensitivity to the data than does the S&P 500 contract. This is not surprising, as the S&P 500 data set has a larger number of observations and therefore dropping a single data point will naturally have less effect. In both cases no single data point stands out as an outlier, and the sampling variation in the estimated RRAs is consistent with the true RRA being different from zero.
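The leave-one-out cross-validation can be sketched in the same spirit. Again the estimator and the data are stand-ins for the actual γ estimation; only the FTSE 100 sample size (108) is taken from the text.

```python
import numpy as np

rng = np.random.default_rng(6)

def estimate_rra(data):
    # Stand-in for re-running the full gamma estimation on a reduced sample
    return data.mean()

# Hypothetical per-observation RRA proxies; 108 matches the FTSE 100 sample size
sample = rng.normal(4.0, 2.0, 108)

# Leave-one-out cross-validation: re-estimate with each observation dropped
loo = np.array([estimate_rra(np.delete(sample, i)) for i in range(sample.size)])
spread = loo.max() - loo.min()  # a large spread would flag influential observations
```

A single influential observation would show up as one leave-one-out estimate far from the rest; a tight spread, as here, is the pattern the text reports for both contracts.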
In addition, we checked to see if the relative performance of the two utility-function adjustments was sensitive to individual outliers. The superiority of the exponential-utility adjustment over the power-utility adjustment, evidenced by a higher p-value, was not altered by dropping any individual observation.

III. Empirical Results

The analysis of the empirical results consists of three sequential steps. We first examine the risk-neutral PDFs to determine whether there is evidence that they adequately capture the distribution of ex post realizations. We next risk-adjust the risk-neutral PDFs and then test these subjective PDFs in the same manner. Conditional on the subjective PDFs providing a better forecast of the distributions of future realizations, we examine the measures of RRA implicit in these risk-adjusted PDFs.

Table IV provides the evidence on our first two questions. We cannot reject the hypothesis that the risk-neutral PDFs provide accurate forecasts of the distributions of future realizations for the FTSE 100 contracts at the 1-week horizon. With a p-value of 23 percent, we find no support for the hypothesis that the 1-week horizon risk-neutral PDFs forecast the FTSE 100 densities poorly. However, in the remaining 13 cases the Berkowitz test rejects the hypothesis that risk-neutral PDFs are good forecasts of the distribution of future values of the underlying index. With the exception of the two 6-week horizon contracts, we cannot reject the hypothesis that the probability integral transforms are uncorrelated (by examining the p-values of the LR1 statistic). This leads us to conclude that the rejection of the null hypothesis of good forecast ability arises from a poor forecast rather than a violation of the independence assumption underlying the test statistic. This result is consistent with our priors that risk-neutral PDFs are unlikely to provide adequate forecasts—there is simply too much evidence that equity markets price risk.
This result confirms the evidence found in most previous studies. These results also demonstrate that the Berkowitz test has sufficient power to reject the good-forecast null. This observation becomes important when we examine the forecast ability of the risk-adjusted PDFs and find very different results. Having previously established that our tests are able to reject in the risk-neutral case, we are more secure in interpreting the failure of the same test to reject in the subjective cases as arising from superior performance of the risk-adjusted PDFs rather than lack of power in our test methodology.

The second stage of our analysis asks whether power and/or exponential utility-adjusted PDFs can improve the forecast ability, in the sense of producing PDFs that can no longer be clearly rejected as good forecasts of the distribution of future values of the underlying index. We begin with the 6-week contracts. In all four cases—FTSE/S&P; power/exponential adjusted—the subjective PDFs continue to be strongly rejected as good forecasts. However, when we examine the tests for autocorrelation, the LR1 p-values, we reject the null hypothesis that the underlying probability integral transforms are uncorrelated. This is consistent with the overlapping nature of the data—six-week-ahead forecasts of monthly observations. This autocorrelation may or may not be driving the rejection of the LR3 statistics. In this situation no inference can be drawn from the rejection of the LR3 statistic. This limitation applies equally to the rejection of risk-neutral and subjective PDFs at the 6-week horizon.[25] For the remaining horizons, the LR1 statistic p-values all fail to be rejected, almost always by comfortable margins. At the 2-week horizon for both contract types and both utility functions, the subjective PDFs are rejected as good forecasts.
The same is true for the power-utility-adjusted PDFs at the 5-week horizon for both contract types, and the exponential-utility-adjusted PDFs barely fail rejection at the 5 percent level. The power-utility-adjusted PDFs for the 1-week S&P 500 contract are also rejected, but the exponential-utility-adjusted PDF cannot be rejected as providing good forecasts. For the 1-week FTSE 100 contract, neither of the subjective PDFs was rejected. For the remaining horizons, both power and exponential utility-adjusted PDFs fail to be rejected at conventional levels of significance. We conclude that, overall, adjusting risk-neutral PDFs using utility functions results in subjective PDFs that for the most part are reasonable forecasts of future distributions of the underlying index, in the sense that they cannot be rejected.

Casual inspection reveals that the power-utility-adjusted PDFs are more likely to be rejected or to be marginally significant when compared with the exponential-utility-adjusted PDFs. Inspection of the p-values reveals that in 8 cases out of 12 the exponential-utility-adjusted PDFs produce a better goodness-of-fit than do the power-utility-adjusted PDFs, with equality obtaining in one of the remaining four cases. This invites a comparison of the relative forecast-ability improvement provided by the two utility functions. However, direct comparison on a case-by-case (horizon/contract-type) basis is not possible. The two models are not nested, and even if a parametric test could be constructed it would be unlikely to have power in such small samples. We can, however, use nonparametric tests to check whether the apparent superiority of the exponential-utility adjustment is significant in an overall sense, across contract types and horizons. The null hypothesis for such a test is that the two utility functions produce equally good forecasts on average, and that therefore we expect to see the exponential-utility-adjusted PDF p-values exceed the power-utility-adjusted PDF p-values approximately half the time, rather than 8 of 12 times.

[25] The overlapping horizon problem is even more severe for longer horizons. We include the 6-week results in our analysis to illustrate the problem and its consequences, and exclude longer horizons because they are similarly uninformative.

Assessing the statistical significance of the 8-of-12 outcome is, however, complicated by the fact that while the PDFs used to compute the probability integral transforms differ across horizons, the same realizations are involved for all horizons. This causes the probability integral transforms to be correlated across horizons. For this reason, we cannot apply a simple binomial distribution to determine the significance of the 8-of-12 result. To solve this problem, Monte Carlo simulations were used to determine the probability of observing 8 instances of exponential utility PDFs achieving higher Berkowitz statistics out of 12 cases under the assumption that the data were drawn from identically distributed, but cross-horizon correlated, data. To mimic the two parallel sets of inverse probability transforms resulting from adjusting the risk-neutral PDFs using power and exponential utility functions, fourteen paired sets (A and B) of uniformly distributed data were generated. To capture the correlation resulting from common expiry-date realizations of the underlying index, we imposed the same correlation structure on the pseudo-data as the actual inverse probability transforms had, and we matched the series lengths to the data. While replicating the actual inverse probability transforms in sample size and correlation structure, each of the paired sets of pseudo-inverse probability transforms was constructed to have identical distributions, so the null hypothesis that the pairs of Berkowitz statistics had the same expectation was true by construction.
Pairs of Berkowitz statistics were then computed for each of the 12 pairs of constructed series, and the frequency of Berkowitz(A) > Berkowitz(B) was noted. The process was repeated 10,000 times. These simulations show that, given the correlation structure in the data, if the power- and exponential-utility adjustments were equally efficacious we would observe one adjustment method beating the other (higher p-values) 8 times out of 12 possibilities only 9.0 percent of the time. We can therefore reject, at the 10 percent significance level, the hypothesis that the power-utility function does as good a job as the exponential-utility function in improving the forecast ability of risk-neutral PDFs.[26] This indicates that, at least for this market, the exponential-utility function provides a better fit of the representative investor's utility function.[27] Many studies assume the representative investor has power utility because of the mathematical tractability of this utility function and the convenience of constant relative risk aversion. This finding suggests a tradeoff is involved in such a choice. Nonetheless, we will continue to examine both methods of risk adjustment in the subsequent analysis.

The estimated values of γ for each contract type, horizon, and utility function are presented in Table V, along with an indication of their approximate levels of significance against a null hypothesis that the true value is zero, based on the Monte Carlo experiments described in the previous section. Excepting only the longer-horizon FTSE cases, we can reject the hypothesis that the estimated gammas are equal to zero. Even in cases where the adjusted PDFs were rejected as good forecasts of the distribution of future values of the underlying asset, for instance both 5-week power-utility-adjusted cases, the forecast-performance-maximizing values of γ were significantly different from zero.
These results provide confirmation of the non-parametric tests in footnote 27 that showed that utility-adjusting PDFs improves the overall forecast ability of risk-neutral PDFs.

The goal of our estimation methodology is to obtain estimates of the market's risk aversion, which we measure by the relative risk aversion of the representative investor. For the power utility function this is simply γ itself. For the exponential utility function we need to multiply γ by the level of the underlying asset. The top panel of Table VI presents the "all observations" RRAs corresponding to the results just discussed.

[26] Given the small sample size, 12 comparisons, a 10 percent level of significance is a reasonable threshold.

[27] The same analysis can be used to ask whether subjective PDFs improve forecast ability relative to risk-neutral PDFs on an overall basis. While the subjective PDFs do involve an additional parameter, this need not necessarily result in a higher bias-adjusted p-value. Indeed, the estimated risk aversion parameter need not be positive, though it always is. The power-utility-adjusted PDFs resulted in higher p-values than the risk-neutral PDFs in 9 of 12 cases, and the exponential-utility-function p-values were higher in 10 cases. The probabilities of these outcomes under the null hypothesis of "same forecast ability" are 8.8 and 7.5 percent, respectively. We can therefore conclude that even though the adjusted PDFs are occasionally themselves rejected as good forecasts, utility-adjusting risk-neutral PDFs does produce better overall forecasts of the distributions of the future values of the underlying index.

There is close agreement across horizons and contract types between the power-utility RRAs and the mean exponential-utility RRAs. Furthermore, the RRAs for the FTSE 100 and S&P 500 are similar for matched horizons.[28] This is not an artifact of the methodology, as the samples are entirely distinct and we see variation between RRAs for different horizons.
The median exponential-utility RRAs are slightly lower than the mean, reflecting the positive skew in the distribution of index values. Thus, while the exponential-utility adjustment appears to produce somewhat superior density forecasts, the (mean) measured RRAs are broadly consistent between the two risk-adjustment functions. However, the range of RRAs permitted by the exponential-utility adjustment, which is quite substantial, coupled with the evidence of better fit, suggests that the constant relative risk aversion inherent in the power-utility adjustment may be unduly restrictive, and that constant absolute risk aversion (a characteristic of the exponential-utility function) seems to be more consistent with the data.

In all cases, the RRAs are consistent with the moderate values found in most of the other studies shown in Table VII. There is no evidence in our results of the extreme, "puzzle" values found in Mehra and Prescott (1985) and Cochrane and Hansen (1992). Our study thus adds to the accumulating evidence that the risk aversion of the representative agent is not always or necessarily extreme, and therefore that the observed equity premium puzzle may be idiosyncratic.

These results, taken together, demonstrate that the risk premium adjustment needed to move from risk-neutral to objective densities is more subtle than a simple mean shift. Both utility functions produce nearly the same (average) measured levels of risk aversion and hence the same average shift in the means of the risk-neutral densities. Our finding of differential performance of power- and exponential-utility function adjustments demonstrates that the manner in which the entire density changes matters. This is consistent with Anagnou, Bedendo, Hodges, and Tompkins (2002), who tested the predictive power of a mean-shifted log-normal density versus a power-utility-adjusted density and found that the latter did better.
Even if crude adjustments to the mean of a density may eliminate a significant portion of the forecast error, giving economic content to the idea of a risk premium requires relating the necessary changes in the density function to preferences.29 Our methodology necessarily imposes the assumption that the utility function’s coefficient of risk aversion, γ, is constant across the sample. With only one realization per observation it is difficult, if not impossible, to estimate time-varying values for these parameters. However, we can examine the robustness of this constant-parameter assumption by dividing the sample into sub-samples and re-estimating the parameters for each sub-sample. Wald or similar tests of differences across sub-samples on a case-by-case basis are unlikely to be statistically significant, recalling that the data did not reject the full-sample risk-adjusted PDFs. However, the patterns of sub-sample differences across horizons and markets are consistent and instructive. Rather than divide the two samples by time period, we elected to divide each into two equal-sized sub-samples corresponding to periods of high and low volatility as measured by the implied volatility of at-the-money options. Our rationale is that risk aversion is more likely to vary with the degree of risk than with time. The middle and lower panels in Table VI present the RRAs measured over these two sub-samples. The results are marked and consistent. For every horizon, and for both FTSE 100 and S&P 500, the low-volatility RRAs exceed the high-volatility RRAs by a factor of approximately 3 to 5.

Footnote 28: This is true for the power utility and the mean and median values for the exponential utility. The S&P exponential-utility RRA ranges are greater than the corresponding FTSE 100 ranges because of the greater range found in the values of the S&P index, which in turn arises from the longer time series available for S&P data.
Taken as a whole, and applying the Monte Carlo-based nonparametric test described above, these differences are statistically significant at the 1 percent level if we consider all horizons and both markets, and at the 5.5 percent level if we consider only horizons of 3 through 5 weeks. This volatility dependence of the risk aversion measure can also be observed in equity markets. For instance, under assumptions similar to those employed earlier, it can be shown that equity returns should be roughly proportional to changes in the variance of returns, where the constant of proportionality equals the representative agent’s RRA:30

$$\frac{dP_t}{P_t} \approx -\gamma\, d\mathrm{Var}(r_t).$$

Table VIII presents the slope coefficients obtained by regressing the daily percentage changes in the S&P 500 index against changes in the variance of the index computed using the CBOE’s implied volatility index. From the above model the slope equals the negative of γ, which under the assumed power-utility function equals the RRA.31 Again we see that periods of high volatility correspond to low estimates of RRA. The full-sample results are dominated by a few observations where implied volatility spiked sharply with little change in the S&P 500 index. Dropping these outliers reduces but does not eliminate the disparity. This provides independent confirmation of the inverse relation between volatility and risk aversion found in Table VI, using different data and a completely different method of estimating RRA.32 A possible explanation for this inverse relation between equity risk and measures of risk aversion lies in our proxy for consumption risk.

Footnote 29: In any case, an empirical density function estimated using non-parametric methods cannot be simply “moved over” in order to shift the mean without changing the range over which the PDF is defined.

Footnote 30: See Appendix B for the derivation of this result.
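The regression just described can be sketched as follows. The series below are simulated stand-ins for the paper's data (the actual regression uses daily S&P 500 percentage changes and variance changes from the CBOE implied volatility index); the names `ret`, `dvar`, and the parameter values are illustrative assumptions only.

```python
import numpy as np

# Synthetic stand-ins for the regression behind Table VIII: daily percentage
# changes in the index (ret) regressed on changes in its implied variance
# (dvar). All numbers here are simulated, not the paper's data.
rng = np.random.default_rng(0)
gamma = 4.0                                        # assumed "true" RRA for the simulation
dvar = rng.normal(0.0, 1e-3, size=2500)            # changes in variance
ret = -gamma * dvar + rng.normal(0.0, 0.01, 2500)  # dP/P = -gamma * dVar + noise

# OLS slope of ret on dvar; the model implies slope = -gamma, so RRA = -slope.
slope = float(np.cov(ret, dvar)[0, 1] / np.var(dvar, ddof=1))
rra_estimate = -slope
print(f"estimated RRA: {rra_estimate:.2f}")
```

With enough observations the recovered RRA is close to the simulated value, illustrating why the slope in such a regression can be read as (minus) the coefficient of risk aversion.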
If consumption risk is more stable than equity risk, as seems likely, then periods of high equity volatility will overstate consumption risk and the representative investor will appear correspondingly less risk averse.33 Similarly, when equity volatility is low it will more closely approximate consumption volatility and the representative investor will appear correspondingly more risk averse. Thus, use of equity returns as a proxy for consumption may induce the observed volatility dependence in the derived measures of risk aversion. An alternative explanation is that the representative investor is changing systematically as volatility changes.

Footnote 31: The order of magnitude of these values differs from those in Table VI due to differences in scaling of the input data.

Footnote 32: A few other studies have examined the question of time-varying risk aversion. Rosenberg and Engle (2002) find evidence that measured risk aversion is correlated with macroeconomic factors. Guo and Whitelaw (2001), on the other hand, find no evidence that risk aversion is time varying. Normandin and St-Amour (1998) find that taste shocks affect risk premia but that market risk does not. Han (2002) finds that the market risk premium is negatively related to volatility risk.

Footnote 33: Numerous studies have documented that equity volatility exceeds consumption volatility and that equity volatility is itself volatile. We know of no study, however, of the volatility of consumption volatility. Given the comparatively long sampling intervals of consumption data and the resulting few observations to work with, this is not surprising. Our conjecture depends only on the assumption that during periods of high (low) equity volatility the difference between equity volatility and consumption volatility increases (decreases), which seems reasonable.
This might happen if, during periods of high volatility, the more risk-averse investors left the market, resulting in a lower average level of risk aversion amongst the remaining investors. The final hypothesis is that the representative investor in fact has volatility-dependent risk aversion. The first, the model-error-due-to-proxy hypothesis, suggests that the higher risk aversions measured during periods of low volatility are closer to the mark. The second hypothesis suggests potential problems that arise in aggregating across investors with dissimilar risk aversions. If a time-varying mix of market participants changes the characteristics of the representative agent (that is, the representative investor himself changes), then time-invariant representative-investor models may be insufficiently rich to capture the empirical regularities.34 The last hypothesis suggests a more fundamental problem of identifying the link between volatility and risk aversion. While it is not inconceivable that a representative investor might become more risk averse as risk declines, intuition suggests exactly the opposite. Deciding amongst these hypotheses is beyond the scope of this paper. Our methodology is necessarily wedded to the use of equity risk to proxy for consumption risk, and the development of theoretical pricing models to aggregate time-varying heterogeneous agents, or a single representative agent with volatility-dependent risk aversion, is apt to prove a challenge to theoreticians. Nonetheless, our volatility-dependent risk aversion estimates, confirmed as they are in stock returns, provide an intriguing insight into the determinants of asset prices. Returning to Table VI, the RRAs estimated over the full samples generally decline with the forecast horizon, excepting the consistently anomalous 2-week horizon.
If we focus on the 3- to 5-week results, which show the clearest contrast between risk-neutral and risk-adjusted forecast performance, the RRAs decline by a factor of slightly less than 2 over that range of forecast horizons. In the at-the-money implied-volatility-based sub-samples, the overall tendency of the estimated RRAs to decline with increasing horizon persists, though it is less consistently monotonic. Again, errors resulting from the use of the equity index to proxy for wealth, and indirectly consumption, are a possible explanation. Excess volatility in equity prices may attenuate at longer horizons, resulting in a better proxy. This plausible explanation is unfortunately inconsistent with the fact that over the 1- to 6-week range of horizons under consideration the observed annualized volatility is essentially flat. An alternative explanation is that skewness is priced, and we do indeed see that annualized skewness declines over the range of horizons in this study. Lastly, the strong horizon dependence in estimated RRAs may result from short-horizon investors being more risk averse. Long-horizon investors can take steps to recover from adverse shocks, including smoothing consumption and increasing non-investment income (working harder), while short-horizon investors may have less flexibility. It is therefore plausible that risk aversion is horizon dependent. Our full-sample results show that, at least for forecast horizons of 3 to 5 weeks, the risk-neutral distributions provide poor forecasts of future densities, while the subjectively adjusted densities provide reasonably good (i.e., not statistically rejectable) forecasts. An obvious question is how much the risk-neutral and subjective densities differ.

Footnote 34: Constantinides’ (2002) criticism of pricing models based on aggregate rather than idiosyncratic consumption is another challenge to the heretofore dominant representative-investor paradigm.
One measure of this is to look at the tail percentile points under the risk-neutral and subjective distributions. The estimation of tail percentile points is of particular importance in risk management, where value-at-risk is widely used. Suppose we were to compute the 1-percentile value under the (rejectable) risk-neutral density forecast each period. These values of the underlying will have different percentile values under the (not rejectable) subjective densities, and the corresponding subjective percentiles of the risk-neutral 1-percentile values may vary from observation to observation. For instance, at the 2-week horizon, the values of the FTSE 100 corresponding to the 1-percentile of the risk-neutral density measured each observation period have subjective cumulative probabilities (percentiles) ranging between 0.2 and 0.8 percent for the power-utility-adjusted densities and between 0.2 and 0.9 percent for the exponential-utility-adjusted densities.35 In all cases, specific loss levels have lower probabilities under the utility-adjusted densities than under the risk-neutral densities. Thus, reliance on risk-neutral densities to estimate and hold capital against a 1 percent value-at-risk would be unduly conservative (and expensive) for long equity positions, and would understate the risk and required capital for short positions. Whether these differences are material depends on the particular application. These differences may be economically unimportant for an unlevered equity portfolio, while for highly levered or equity derivative portfolios these differences could be crucial to the sound management of risk. These are, of course, average results. The high-low implied volatility results presented in Table VI show that reliance on risk-neutral densities would be less problematical during periods of high volatility and more problematical during periods of low volatility.
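The tail comparison just described can be illustrated with a small sketch. The paper's densities are estimated from option prices; here two lognormal distributions merely play the roles of the risk-neutral and utility-adjusted (subjective) densities, and every parameter value is an invented placeholder.

```python
import numpy as np
from scipy import stats

# Illustrative stand-ins for one observation period: all parameters invented.
risk_neutral = stats.lognorm(s=0.060, scale=4800.0)
subjective = stats.lognorm(s=0.055, scale=4800.0 * 1.004)

# 1-percentile point of the risk-neutral density ...
var_level = risk_neutral.ppf(0.01)
# ... and the probability the subjective density places below that level.
subj_prob = subjective.cdf(var_level)
print(subj_prob < 0.01)  # the subjective density assigns a smaller tail probability
```

Repeating this each observation period gives the range of subjective percentiles of the risk-neutral 1-percentile point reported in the text.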
The difference between the means of the risk-neutral and subjective PDFs, normalized by one of the means (we use the risk-neutral PDF mean), is an approximate measure of the equity risk premium. Figure 3 plots the time series of the 4-week-horizon risk premia for the S&P 500 contract. The same data are presented in both panels with differing scales for clarity. Until 1997, the exponential-utility-estimated risk premium was less than that estimated using a power-utility adjustment. Since 1997, this relation has been reversed. Changes in the risk premia appear to be correlated across risk-adjustment methods, as one would expect. However, differences in estimated risk premia can be large. For instance, during the 1987 stock market crash the power-utility-adjusted PDF suggested a risk premium nearly three times as large as that estimated using an exponential utility function to adjust PDFs. This spike results from the subjective PDFs having markedly higher variances during the 1987 crash (power: 0.33; exponential: 0.31) than the corresponding risk-neutral PDF (0.27). Figure 4 compares the standard deviations and skewness coefficients implied by the subjective PDFs against those from the risk-neutral PDFs for one contract/horizon (S&P 500, 4 weeks). Results for other contracts and horizons are similar. Figure 4 shows that for most observations the second and third moments do not differ substantially between risk-neutral and subjective PDFs.

Footnote 35: The differences between the mean 2-week-horizon FTSE 100 risk-neutral 1-percentile point (3,975) and the corresponding power- and exponential-utility-adjusted 1-percentile points (4,010 and 4,015) are a small percentage of the mean level of the index. However, when compared to the mean absolute change in the index level over the 2-week horizon (85), the 1-percentile point differences (35/45) are large. Comparisons for other horizons/contracts are similar.
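The risk premium measure described above reduces to a one-line computation once the two density means are in hand. The mean levels and horizon below are illustrative placeholders, not estimates from the paper.

```python
# Approximate equity risk premium: difference of the subjective and
# risk-neutral density means, normalized by the risk-neutral mean, then
# annualized for a 4-week horizon. All numbers are invented placeholders.
mean_rn = 4800.0       # mean of the risk-neutral PDF
mean_subj = 4818.0     # mean of the utility-adjusted (subjective) PDF
horizon_weeks = 4

premium = (mean_subj - mean_rn) / mean_rn        # per-horizon risk premium
annualized = premium * 52.0 / horizon_weeks
print(f"annualized risk premium: {annualized:.4f}")
```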
The exception is the September 16, 1987, observation, which shows up as an outlier on the scatter plots. Nonetheless, the differences are sufficient to induce a statistically significant difference in the forecast ability of the subjective and risk-neutral PDFs, and a time-varying equity risk premium of around 10 percent per annum for most of the 1983 to 2001 period.

IV. Conclusions

Options prices embed market expectations of the distribution of future values of the underlying asset. This can provide potentially useful information for risk managers and analysts wishing to extract forecasts from security market prices. However, the risk-neutral density forecasts that are produced from options prices cannot be taken at face value. We have shown, consistent with the work of others, that risk-neutral PDFs estimated from S&P 500 and FTSE 100 options do not provide good forecasts of the distribution of future values of the underlying asset, at least at the horizons for which we can obtain unambiguous results. Theory tells us that if investors are risk averse and rational, the subjective density functions they use in forming their expectations will be linked to the risk-neutral density functions used to price options by a pricing kernel. Theory also suggests certain properties this pricing kernel might be expected to have. We have employed two widely used, and theoretically plausible, utility functions to infer the unobservable subjective densities by adjusting the observed risk-neutral densities. Our criterion in making this adjustment is to choose the risk aversion parameter that produces subjective densities that best fit the distributions of realized values. That is, we assume that investors are rational forecasters of the distributions of future outcomes, and thus that the risk aversion parameter value that best fits the data is most likely to correspond to that of the representative agent.
In applying this methodology, we assume that investors’ utility functions are stationary. This contrasts with the assumption made in previous papers that the statistical distribution was stationary. The subjective density functions derived under our assumption cannot be rejected as good forecasters of the distributions of future outcomes (unlike the unadjusted PDFs), and so this assumption appears to be validated on a practical level, subject to the caveat that there is some evidence of volatility dependence in the risk aversion estimates. The coefficient-of-risk-aversion estimates obtained by our methodology are comparable to those obtained in most previous studies. There is little evidence of risk aversions so high as to constitute a puzzle. We have also been able to establish, we believe for the first time, that the risk aversion estimates are surprisingly robust to differences in the specification of the representative investor’s utility function and to the data set used. We also show that the estimated coefficients of risk aversion decline with the forecast horizon and are higher during periods of low volatility; both results imply that theoretical models may need to evolve to capture these effects.

References

Aït-Sahalia, Yacine, and Andrew W. Lo, 2000, Nonparametric risk management and implied risk aversion, Journal of Econometrics 94(1–2), 9–51.

Aït-Sahalia, Yacine, Yubo Wang, and Francis Yared, 2001, Do option markets correctly price the probabilities of movement of the underlying asset? Journal of Econometrics 102(1), 67–110.

Anagnou, Iliana, Mascia Bedendo, Stewart Hodges, and Robert Tompkins, 2002, The relation between implied and realised probability density functions, Working paper, University of Technology, Vienna.

Arrow, Kenneth J., 1971, Essays in the Theory of Risk Bearing (North-Holland, Amsterdam).

Barone-Adesi, Giovanni, and Robert E. Whaley, 1987, Efficient analytic approximation of American option values, Journal of Finance 42(2), 301–320.
Bartunek, K. S., and M. Chowdhury, 1997, Implied risk aversion parameter from option prices, The Financial Review 32, 107–124.

Berkowitz, Jeremy, 2001, Testing density forecasts with applications to risk management, Journal of Business and Economic Statistics 19, 465–474.

Bliss, Robert R., and Nikolaos Panigirtzoglou, 2002, Testing the stability of implied probability density functions, Journal of Banking and Finance 26(2–3), 381–422.

Breeden, Douglas T., and Robert H. Litzenberger, 1978, Prices of state-contingent claims implicit in options prices, Journal of Business 51, 621–651.

Campa, J. M., P. H. K. Chang, and R. L. Reider, 1997, Implied exchange rate distributions: Evidence from OTC option markets, NBER Working Paper No. 6179.

Campa, J. M., P. H. K. Chang, and R. L. Reider, 1998, An options-based analysis of emerging market exchange rate expectations: Brazil’s Real Plan, 1994–1997, Working paper, New York University.

Campbell, John Y., Andrew W. Lo, and A. Craig MacKinlay, 1997, The Econometrics of Financial Markets (Princeton University Press, Princeton, NJ).

Cochrane, John H., and Lars P. Hansen, 1992, Asset pricing explorations for macroeconomics, in 1992 NBER Macroeconomics Annual (NBER, Cambridge, MA).

Constantinides, George M., 2002, Rational asset prices, Journal of Finance 57(4), 1567–1591.

Coutant, Sophie, 2001, Implied risk aversion in options prices, in Information Content in Option Prices: Underlying Asset Risk-Neutral Density Estimation and Applications (Ph.D. thesis, University of Paris IX Dauphine).

Diebold, Francis X., Todd A. Gunther, and Anthony S. Tay, 1998, Evaluating density forecasts with applications to financial risk management, International Economic Review 39(4), 863–883.

Duffee, Gregory R., 1996, Idiosyncratic variation of Treasury bill yields, Journal of Finance 51, 527–551.

Efron, Bradley, and Robert J. Tibshirani, 1993, An Introduction to the Bootstrap (Chapman & Hall, New York).

Epstein, L., and S. Zin, 1991, Substitution, risk aversion and the temporal behaviour of consumption and asset returns: An empirical analysis, Journal of Political Economy 99, 263–268.

Ferson, Wayne E., and George M. Constantinides, 1991, Habit persistence and durability in aggregate consumption: Empirical tests, Journal of Financial Economics 29, 199–240.

Friend, I., and M. E. Blume, 1975, The demand for risky assets, American Economic Review 65, 900–922.

Galati, G., and W. Melick, 1999, Perceived central bank intervention and market expectations: An empirical study of the yen/dollar exchange rate 1993–96, Bank for International Settlements, Working Paper No. 77.

Gemmill, G., and A. Saflekos, 1999, How useful are implied distributions? Evidence from stock-index options, Working paper, City University Business School, London.

Guo, Hui, and Robert Whitelaw, 2001, Risk and return: Some new evidence, Federal Reserve Bank of St. Louis, Working Paper No. 2001–001A.

Hamilton, James D., 1994, Time Series Analysis (Princeton University Press, Princeton, NJ).

Han, Yufeng, 2002, On the relation between the market risk premium and volatility, Working paper, Washington University, St. Louis.

Hansen, Lars P., and Kenneth J. Singleton, 1982, Generalized instrumental variables estimation of nonlinear rational expectations models, Econometrica 50, 1269–1286.

Hansen, Lars P., and Kenneth J. Singleton, 1984, Errata: Generalized instrumental variables estimation of nonlinear rational expectations models, Econometrica 52, 267–268.

Huang, Chi-Fu, and Robert H. Litzenberger, 1988, Foundations for Financial Economics (North-Holland, New York, NY).

Jackwerth, Jens Carsten, 2000, Recovering risk aversion from option prices and realized returns, Review of Financial Studies 13(2), 433–467.

Jorion, Philippe, and Alberto Giovannini, 1993, Time series test of a non-expected utility model of asset pricing, European Economic Review 37, 1083–1100.
Kendall, Maurice, and Alan Stuart, 1979, The Advanced Theory of Statistics (Macmillan Publishing Co., Inc., New York, NY).

Malz, Allan M., 1997, Estimating the probability distribution of the future exchange rate from options prices, Journal of Derivatives 5(2), 18–36.

Mehra, R., and Edward Prescott, 1985, The equity premium: A puzzle, Journal of Monetary Economics 15, 145–161.

Melick, W. R., and C. P. Thomas, 1997, Recovering an asset’s implied PDF from option prices: An application to crude oil during the Gulf crisis, Journal of Financial and Quantitative Analysis 32, 91–115.

Normandin, Michel, and Pascal St-Amour, 1996, Substitution, risk aversion, taste shocks and equity premia, Working paper, Université Laval, Cité Universitaire, Canada.

Pérignon, Christophe, and Christophe Villa, 2002, Extracting information from options markets: Smiles, state-price densities and risk aversion, European Financial Management 8(4), 495–513.

Rosenberg, Joshua V., and Robert F. Engle, 2002, Empirical pricing kernels, Journal of Financial Economics 64(3), 341–372.

Shiratsuka, S., 1999, Information content of implied probability distributions: Empirical studies of Japanese stock price index options, Working paper, Bank of Japan.

Söderlind, P., 1999, Market expectations in the UK before and after the ERM crisis, Stockholm School of Economics, SSE/EFI Working Paper Series in Economics and Finance, Working Paper No. 210.

Weinberg, Steven A., 2001, Interpreting the volatility smile: An examination of the information content of options prices, Board of Governors of the Federal Reserve System, International Finance Discussion Paper No. 706.
Appendix A: Density Forecast Evaluation

Testing whether a series of estimated time-varying density functions, $\hat f_t(\cdot)$, equals the true underlying density functions, $f_t(\cdot)$, when we observe a series of single outcomes, $X_t$, for each density, reduces to testing whether the inverse probability density transforms, $y_t$,

$$y_t \equiv \int_{-\infty}^{X_t} \hat f_t(u)\,du,$$

are uniformly distributed. The test statistics used for making this determination all assume that the $y_t$ are independently and identically distributed (i.i.d.); therefore, independence is necessarily a joint hypothesis with uniformity.

Several non-parametric methods have been proposed for testing the uniformity of the inverse-probability-transformed data. The Chi-squared test is based on dividing the $[0,1]$ interval into a number of buckets and then counting the number of times the inverse probability transform falls into each bucket. The result is a series of counts $n_i$, $i = 1, \ldots, K$, where $K$ is the number of buckets and $N = \sum_{i=1}^{K} n_i$ is the number of observations. Under the null hypothesis that $y_t \sim \text{i.i.d. } U(0,1)$, each bucket is expected to contain $\bar n_i \equiv E(n_i) = N/K$ observations. The Chi-squared test then uses the statistic

$$\chi^2 \equiv \sum_{i=1}^{K} \frac{(n_i - \bar n_i)^2}{\bar n_i},$$

which is distributed $\chi^2$ with $K-1$ degrees of freedom under the null hypothesis.

The Kolmogorov-Smirnov and Kuiper tests are based on the difference between the observed and theoretical cumulative density functions, $D(y_t) = \hat F(y_t) - F(y_t)$. In this case the theoretical cumulative density is the uniform, so $F(y_t) = y_t$. The observed cumulative density is just the rank order divided by the number of observations, $\hat F(y_t) = \mathrm{rank}(y_t)/T$. The Kolmogorov-Smirnov test is the maximum absolute difference between the observed and theoretical cumulative densities:

$$D_{KS} = \max_{1 \le t \le T} \left| \hat F(y_t) - F(y_t) \right|.$$

The significance level for an observed value of $D_{KS}$ under the uniformly distributed null is given by

$$\Pr(D_{KS} > \text{observed}) = Q_{KS}\!\left(\left(\sqrt{T} + 0.12 + 0.11/\sqrt{T}\right) D_{KS}\right),$$

where

$$Q_{KS}(x) = 2 \sum_{j=1}^{\infty} (-1)^{j-1} e^{-2 j^2 x^2},$$

which we approximate by summing the first 1,000 terms.

The Kuiper test sums the maximum positive and negative differences between the observed and theoretical cumulative densities:

$$D_K = \max_{1 \le t \le T}\left(\hat F(y_t) - F(y_t)\right) + \max_{1 \le t \le T}\left(F(y_t) - \hat F(y_t)\right).$$

The significance level for an observed value of $D_K$ under the uniformly distributed null is given by

$$\Pr(D_K > \text{observed}) = Q_K\!\left(\left(\sqrt{T} + 0.155 + 0.24/\sqrt{T}\right) D_K\right),$$

where

$$Q_K(x) = 2 \sum_{j=1}^{\infty} (4 j^2 x^2 - 1) e^{-2 j^2 x^2},$$

which we again approximate by summing the first 1,000 terms.

None of these methods provides a test of the joint assumption that the $y_t$ are i.i.d. Diebold, Gunther, and Tay (1998) suggest testing independence and uniformity separately, using the correlogram of the $y_t$ to test for independence and, subject to not rejecting the hypothesis that the data are independent, using a Chi-squared test to test the hypothesis that the probability integral transforms are uniformly distributed. However, as they point out, to separate fully the desired $U(0,1)$ and i.i.d. properties of $y_t$, “we would like to construct confidence intervals for histogram bin heights that condition on uniformity but are robust to dependence of unknown form,” and “confidence intervals for the autocorrelations that condition on independence but are robust to non-uniformity.” Unfortunately, since there is no serial-correlation-adjusted Chi-squared test of known small-sample properties for the uniformity hypothesis, Diebold, Gunther, and Tay are unable to conduct a simultaneous joint test of the i.i.d. and uniformly distributed properties.
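The Kolmogorov-Smirnov machinery above is straightforward to implement; a minimal sketch follows, using the paper's statistic $D_{KS} = \max_t |\mathrm{rank}(y_t)/T - y_t|$ and the truncated series for $Q_{KS}$. The sample data are simulated, not the paper's probability integral transforms.

```python
import numpy as np

def q_ks(x, terms=1000):
    """Asymptotic K-S tail probability:
    Q_KS(x) = 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 j^2 x^2),
    approximated by summing the first `terms` terms."""
    j = np.arange(1, terms + 1)
    return float(2.0 * np.sum((-1.0) ** (j - 1) * np.exp(-2.0 * j**2 * x**2)))

def ks_pvalue(y):
    """P-value for testing y ~ U(0,1): D_KS = max |rank/T - y|, with the
    finite-sample correction (sqrt(T) + 0.12 + 0.11/sqrt(T)) * D_KS."""
    y = np.sort(np.asarray(y))
    t = len(y)
    d_ks = float(np.max(np.abs(np.arange(1, t + 1) / t - y)))
    return q_ks((np.sqrt(t) + 0.12 + 0.11 / np.sqrt(t)) * d_ks)

rng = np.random.default_rng(1)
print(f"uniform sample p-value: {ks_pvalue(rng.uniform(size=100)):.3f}")
```

A clearly non-uniform sample (e.g., cubed uniforms, which pile up near zero) produces a p-value near zero, while genuinely uniform data is rarely rejected.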
They use the Kolmogorov-Smirnov (K-S) and Chi-squared statistics for testing uniformity but, as they point out, the impact of departures from randomness on the performance of these non-parametric tests is not known. To test the significance of autocorrelations, Diebold, Gunther, and Tay construct finite-sample confidence intervals that condition on independence but are robust to deviations from uniformity by sampling with replacement from the series of probability integral transforms and building up the distribution of sample autocorrelations. The drawback of their methodology is that they separate the joint hypothesis test into two different tests. The use of the binomial distribution that they mention in their paper is also controversial, since the numbers of observations in each bin are not independent and actually follow a multinomial distribution. The Chi-squared test is the appropriate test for the uniform-distribution null hypothesis in this case.

Berkowitz (2001) proposes a density evaluation methodology that does provide a joint test of independence and normality. Furthermore, unlike the non-parametric Chi-squared, Kolmogorov-Smirnov, and Kuiper tests, which discard sample information either by bucketing or by selecting single observations (maximum deviations), the Berkowitz test utilizes all observations. The Berkowitz joint hypothesis test, LR3, described in the body of this paper, tests both uniformity and independence. For diagnostic purposes, restricted forms of the Berkowitz test are possible; for instance, LR1 tests independence under the assumption that the data are uniform.

To choose between alternative methods for testing the uniformity of the inverse-probability-transformed data, we used Monte Carlo simulations to examine the small-sample properties of these different statistical tests, both under the null hypothesis of independently distributed uniform random variates and when the simulated data were autocorrelated.
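A minimal sketch of a Berkowitz-style LR3 test follows. It maps the probability integral transforms to normals, fits a Gaussian AR(1), and compares it to the N(0,1) i.i.d. null by likelihood ratio. This is a conditional-likelihood simplification (the first observation's marginal term is dropped), so it may differ in detail from the paper's exact implementation; the data are simulated.

```python
import numpy as np
from scipy import stats

def berkowitz_lr3(u):
    """Berkowitz-style joint test of independence and normality.
    Map PITs u to normals z = Phi^{-1}(u), fit z_t = mu + rho*z_{t-1} + eps_t,
    and compare against the N(0,1) i.i.d. null. Returns (LR3, p-value);
    LR3 is asymptotically chi-squared with 3 d.o.f. under the null."""
    z = stats.norm.ppf(np.asarray(u))
    z0, z1 = z[:-1], z[1:]
    rho, mu = np.polyfit(z0, z1, 1)           # OLS = MLE for the Gaussian AR(1)
    resid = z1 - mu - rho * z0
    sigma2 = np.mean(resid**2)
    ll_fit = np.sum(stats.norm.logpdf(resid, scale=np.sqrt(sigma2)))
    ll_null = np.sum(stats.norm.logpdf(z1))   # mu = 0, rho = 0, sigma = 1
    lr3 = 2.0 * (ll_fit - ll_null)
    return lr3, 1.0 - stats.chi2.cdf(lr3, df=3)

rng = np.random.default_rng(2)
lr3, p = berkowitz_lr3(rng.uniform(size=200))
print(f"LR3 = {lr3:.2f}, p-value = {p:.3f}")
```

Because the null is nested in the fitted AR(1) family, LR3 is non-negative by construction, and badly non-uniform PITs are rejected decisively.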
To do this we needed to generate autocorrelated uniformly distributed random numbers. Beginning with a series of random standard normal numbers, $x_t \sim N(0,1)$, we construct autocorrelated normally distributed random variables with first-order autocorrelation ρ by creating the MA(1) variables $y_t = x_t + \theta x_{t-1}$, where $y_t \sim N(0, 1+\theta^2)$ with autocorrelation $\rho = \theta/(1+\theta^2)$. To create uniformly distributed numbers $u_t$ we transform the $y_t$ using the cumulative normal distribution, that is, $u_t = \Phi(y_t;\, 0,\, 1+\theta^2)$, where $\Phi(x; \mu, \sigma^2)$ is the normal cumulative distribution function with parameters μ and σ², evaluated at x. These uniform random numbers also have a first-order autocorrelation of ρ. We test three small sample sizes (50, 100, and 200) and three autocorrelation coefficients (0, 0.1, and 0.2), using 10,000 replications for each size/autocorrelation pair. To validate our simulations, we also ran large-sample simulations using 10,000 data points in each simulation. For each size/autocorrelation pair we computed the number of times each test statistic exceeded its theoretical 90 and 99 percent levels. The results are presented in Table A1.

In large samples (T = 10,000), and when the null hypothesis is true, i.e., ρ = 0, all four tests perform well. However, when the null hypothesis is true but sample sizes are small, only the Berkowitz and Kolmogorov-Smirnov tests perform well. The Chi-squared test does slightly worse, while the Kuiper test is quite unreliable. In large samples with autocorrelated data, the Berkowitz test rejects with near certainty, while the Chi-squared, Kolmogorov-Smirnov, and Kuiper tests reject at approximately the same frequency as with uncorrelated data. In small samples, the Berkowitz test rejects slightly more frequently than with uncorrelated data, with the rejection rate increasing in the degree of autocorrelation.
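The MA(1) construction just described can be sketched directly; the simulated data here are merely a check of the construction, not the paper's Monte Carlo runs.

```python
import numpy as np
from scipy import stats

def autocorrelated_uniforms(t, theta, seed=0):
    """Uniforms with first-order autocorrelation: build MA(1) normals
    y_t = x_t + theta * x_{t-1} with x_t ~ N(0,1), then map each y_t
    through the N(0, 1 + theta^2) CDF."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(t + 1)
    y = x[1:] + theta * x[:-1]                 # y_t ~ N(0, 1 + theta^2)
    return stats.norm.cdf(y, scale=np.sqrt(1.0 + theta**2))

# theta chosen so rho = theta / (1 + theta^2) = 0.2, one of the cases in Table A1.
u = autocorrelated_uniforms(10_000, theta=0.209)
print(f"lag-1 sample autocorrelation: {np.corrcoef(u[:-1], u[1:])[0, 1]:.3f}")
```

A quick check of the sample lag-1 autocorrelation against the target ρ = 0.2 confirms that the transformed uniforms carry autocorrelation close to that of the underlying normals.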
For the same data, the Kolmogorov-Smirnov test rejects only trivially more frequently than for uncorrelated data. Thus, we conclude that the Kuiper test is wholly inadequate for small-sample analysis. The Chi-squared test, while perhaps adequate, is dominated by the Berkowitz and Kolmogorov-Smirnov tests for small-sample analysis. While both the Berkowitz and Kolmogorov-Smirnov tests appear to do well under the null hypothesis in large and small samples, the Berkowitz test has an edge when the data are in fact autocorrelated. Since some of our actual data are from overlapping observations (5- and 6-week horizons), we are concerned about potential autocorrelation. For this reason, and because the Berkowitz test is the only one of the four tests to jointly test independence and normality, we choose to use the Berkowitz test in this paper.

Appendix B: How Volatility Changes Impact Prices

The impact of changes in volatility on the risk premium, and thence on the price of assets, involves several steps. An increase in equity volatility generally leads to an increase in the risk premium, though the expected change is model dependent.36 An increase in the risk premium in turn has two effects: an immediate decrease in prices and a greater expected return in the future, assuming the increase in risk is not accompanied by any information to change the expected level of future cash flows. To make this relationship explicit, consider the following simple model. Campbell, Lo, and MacKinlay (1997, p. 307) show that under certain (usual) assumptions regarding the structure of markets, log-normality of asset returns, and a representative investor with a power utility function, the risk premium can be expressed as follows:

$$\log E_t\!\left[\frac{1+R_{i,t+1}}{1+R_{f,t+1}}\right] = \gamma\,\sigma_{ic},$$

where $R_{i,t+1}$ is the return on the risky traded asset, $R_{f,t+1}$ is the riskless return, $\sigma_{ic}$ is the covariance between the risky asset and consumption, and γ is the coefficient of risk aversion, or under the assumed power utility the representative investor's RRA. In our analysis we use the return on the equity index to proxy for wealth and therefore changes in consumption. Under this assumption, and to acknowledge that variance may change, we replace $\sigma_{ic}$ with $\sigma_{it}^2$. We can also replace $1/(1+R_{f,t+1})$ with the price of the riskless bond, $B_t$, and take it outside the expectation. Lastly, we replace $(1+R_{i,t+1})$ with the equivalent in terms of the current and end-of-period prices of the risky asset, $P_{i,t+1}/P_{it}$, where $P_{i,t+1}$ is the cum-dividend value of the risky asset at time t + 1. The previous expression for the risk premium thus becomes:

$$\log\!\left(B_t\, E_t\!\left[\frac{P_{i,t+1}}{P_{it}}\right]\right) = \gamma\,\sigma_{it}^2.$$

We next examine the effect of an instantaneous change in the volatility of the risky asset by differentiating both sides with respect to $\sigma_{it}^2$:

$$\gamma = \frac{\partial(\gamma\sigma_{it}^2)}{\partial\sigma_{it}^2}
= \frac{\partial}{\partial\sigma_{it}^2}\log\!\left(B_t\, E_t\!\left[\frac{P_{i,t+1}}{P_{it}}\right]\right)
= \frac{B_t}{B_t\, E_t\!\left[\frac{P_{i,t+1}}{P_{it}}\right]}\,\frac{\partial}{\partial\sigma_{it}^2}E_t\!\left[\frac{P_{i,t+1}}{P_{it}}\right]
= \frac{1}{E_t\!\left[\frac{P_{i,t+1}}{P_{it}}\right]}\left[\frac{1}{P_{it}}\frac{\partial E_t(P_{i,t+1})}{\partial\sigma_{it}^2} - \frac{E_t(P_{i,t+1})}{P_{it}^2}\frac{\partial P_{it}}{\partial\sigma_{it}^2}\right].$$

If we assume that the instantaneous change in volatility does not affect future expected cash flows, then as a first approximation37

$$\frac{\partial E_t(P_{i,t+1})}{\partial\sigma_{it}^2} \approx 0
\quad\text{and}\quad
\frac{1}{P_{it}}\frac{\partial P_{it}}{\partial\sigma_{it}^2} \approx -\gamma.$$

Thus, a change in volatility can be expected to have a coincident effect on prices as follows:

$$\frac{\Delta P_{it}}{P_{it}} \approx -\gamma\,\Delta\sigma_{it}^2.$$

This is consistent with our expectation that as volatility increases asset prices fall, and states that under the representative-investor power-utility assumption the constant of proportionality is the coefficient of risk aversion.

Footnote 36: Han (2002) argues that the risk premium is positively related to volatility and negatively related to volatility risk, that is, the risk that volatility will change. Depending on the relation between the level of volatility and volatility risk, the total effect of an increase in volatility on the risk premium may be less than expected.

Footnote 37: There is a large literature showing that volatility shocks are persistent, at least over shorter intervals. On the other hand, an examination of the time series of implied volatilities shows that volatility spikes tend to be of short duration. For purposes of this “back of the envelope” analysis, it is sufficient if the effects of volatility changes on future prices at some horizon are less than those on current prices.

Table I
Summary Statistics for Samples of Options Cross-Sections

Description of option cross-sections after constructing matched sets of option prices and interest rates (for constructing forecast densities) and realizations of the underlying asset (for testing). Option cross-sections containing fewer than 5 strikes with positive time value were eliminated. Option observation dates were selected to have expiries nearest to the target horizon, with a maximum permissible variation of 3 days for weekly horizons and 4 days for monthly horizons. Horizons out to 2 months include serial contracts expiring at monthly intervals. Beyond 2 months, only quarterly expiries are available.

                        FTSE 100                      S&P 500
            Number of   Strikes per         Number of   Strikes per
            Cross-      Cross-Section       Cross-      Cross-Section
  Horizon   Sections    Min.  Mean  Max.    Sections    Min.  Mean  Max.
  1 week         99       5   10.8   28         169       5   14.8   64
  2 weeks       108       5   14.9   43         172       5   20.0   63
  3 weeks       108       7   18.5   54         178       5   23.7   70
  4 weeks       108       7   21.7   59         184       5   25.3   77
  1 month       108       9   22.8   56         183       5   26.0   76
  5 weeks       108      10   24.4   62         184       5   27.0   83
  6 weeks       108      11   26.5   66         182       5   28.0   81
  2 months      108       7   33.8   81         172       5   27.5   89
  3 months       47       8   35.0   91          74       7   30.4   98
  4 months       36       9   24.4   87          72       5   28.4   77
  5 months       36       9   23.4   91          72       6   24.5   60
  6 months       35       8   21.6   56          67       5   20.3   56
  9 months       34       8   20.5   59          38       5   18.1   30
  1 year          4      11   11.3   12           2       5    6.0    7

Table II
Utility Functions and Associated Formulae

Functional forms of the two utility functions used to adjust risk-neutral density functions, together with the marginal utility and the measure of relative risk aversion:

$$RRA = -\frac{S_T\, U''(S_T)}{U'(S_T)}.$$

  Utility Function    U(S_T)                      U'(S_T)         RRA
  Power               (S_T^{1-γ} - 1)/(1-γ)       S_T^{-γ}        γ
  Exponential         -e^{-γ S_T}/γ               e^{-γ S_T}      γ S_T

Table III
Distribution of Bootstrap Estimates of the Coefficient of Relative Risk Aversion

Coefficients of relative risk aversion (RRA) estimated from 1,000 bootstrap samples for each contract type and representative horizon. Each bootstrap sample was of the same size as the original sample (see Table I) and was constructed by sampling with replacement from the original sets of matched option cross-sections, interest rates, and realizations. Percent of RRA < 0 provides a bootstrap test of the hypothesis that RRA = 0, that is, that the representative investor in these securities is risk neutral, based on the sampling variation in the data. Standard deviations of Monte Carlo RRA estimates, constructed under the assumption that investors are risk neutral (see Section II.D.1), are provided for comparison with the standard deviations of the bootstrap estimates. Point estimates of the RRA using the original sample are provided for comparison with the mean and median of the bootstrap estimates.
                               FTSE 100 (3-week)             S&P 500 (5-week)
                             Power       Exponential       Power       Exponential
                             Utility     Utility (Mean)    Utility     Utility (Mean)
Bootstrap RRA estimates
  Minimum                     -1.25        -1.90            -1.34        -1.17
  Mean                         5.02         5.11             3.72         3.95
  Median                       5.05         5.08             3.68         3.95
  Maximum                     11.76        11.03             8.17         8.56
  RRA < 0                      0.4%         0.4%             0.5%         0.5%
  Standard Deviation           1.99         1.92             1.37         1.41
Monte Carlo RRA estimates
  Standard Deviation           2.24         2.17             1.34         1.28
Original sample RRA estimates
  Point estimates              5.10         4.99             3.53         3.26

Table IV
Berkowitz Statistic P-Values for Risk-Neutral and Power- and Exponential-Utility-Adjusted PDFs

This table presents the results of tests of the ability of risk-neutral and subjective PDFs to forecast the future distribution of the prices of the underlying asset. Tests use a modified Berkowitz test. Power- and exponential-utility-adjusted PDFs are constructed by adjusting the risk-neutral PDF using the appropriate utility function and equation (1). The utility function risk aversion parameters were selected to maximize the Berkowitz statistic LR3. The reported LR3 value is the p-value of the Berkowitz likelihood ratio test for i.i.d. normality of the inverse-normal transformed inverse-probability transforms of the realizations (equations (2) and (3)):

    LR3 = −2[ L(0, 1, 0) − L(μ̂, σ̂², ρ̂) ].

The power- and exponential-utility p-values have been adjusted for the bias resulting from maximizing the LR3 p-value over values of the risk aversion coefficient γ. Adjustments are based on Monte Carlo simulations (see Section II.D.1). The LR1 statistic is the p-value of the Berkowitz likelihood ratio test for independence of the same transformed data:

    LR1 = −2[ L(μ̂, σ̂², 0) − L(μ̂, σ̂², ρ̂) ].

Rejection of the test for independence (low LR1 values) means that rejection of the "good" forecast null hypothesis (low LR3 values) may be due to serial correlation rather than poor density forecasts.
                                 FTSE 100                  S&P 500
Forecast   PDF              N     LR3     LR1        N      LR3     LR1
Horizon
1 week     Risk-neutral     99   0.233   0.578      168    0.003   0.295
           Power                 0.740   0.506             0.038   0.319
           Exponential           0.643   0.368             0.159   0.763
2 weeks    Risk-neutral    108   0.006   0.581      171    0.003   0.845
           Power                 0.009   0.442             0.010   0.973
           Exponential           0.010   0.324             0.033   0.486
3 weeks    Risk-neutral    108   0.044   0.596      177    0.001   0.441
           Power                 0.207   0.920             0.096   0.591
           Exponential           0.269   0.854             0.274   0.727
4 weeks    Risk-neutral    108   0.035   0.466      183    0.022   0.678
           Power                 0.114   0.713             0.204   0.916
           Exponential           0.143   0.863             0.326   0.712
5 weeks    Risk-neutral    108   0.021   0.659      183    0.007   0.258
           Power                 0.041   0.447             0.051   0.184
           Exponential           0.046   0.319             0.048   0.113
6 weeks    Risk-neutral    108   0.018   0.057      181    0.000   0.019
           Power                 0.018   0.033             0.003   0.009
           Exponential           0.016   0.025             0.003   0.002

Table V
Estimates of the Risk Aversion Parameter γ

Values of the risk aversion parameter γ obtained by maximizing the forecast ability of the adjusted PDFs, measured using the Berkowitz LR3 statistic. *, **, and *** indicate that the values are statistically significantly different from zero at the 10%, 5%, and 1% levels of significance respectively, based on the Monte Carlo simulations described in Section II.D.1.

Forecast        FTSE 100                    S&P 500
Horizon     Power       Exponential     Power       Exponential
            Utility     Utility         Utility     Utility
1 week       7.91**      1.52**          9.52***    15.97***
2 weeks      4.44*       1.00*           5.38**      8.44**
3 weeks      5.10**      1.11**          6.85***    10.38***
4 weeks      4.05**      0.91**          4.08***     6.33**
5 weeks      3.04**      0.66**          3.53***     5.22***
6 weeks      1.97        0.37            3.37***     4.36**

Table VI
Measures of Relative Risk Aversion Implied by PDFs Adjusted Using Power and Exponential Utility Functions

Values for the representative investor's relative risk aversion (RRA) obtained by maximizing the forecast ability of the adjusted PDFs, measured using the Berkowitz LR3 statistic. For the power utility the RRA is γ, while for the exponential utility the RRA is γS_t and therefore varies with the level of the index.
The high and low at-the-money (ATM) volatility results were obtained by first dividing the sample into two equal halves based on the mean of the implied volatility of the two nearest-the-money strikes for each cross-section, and then re-estimating the Berkowitz LR3-maximizing value of γ for each sub-sample.

                          FTSE 100                                   S&P 500
Forecast   Power     Exponential Utility               Power     Exponential Utility
Horizon    Utility   Range          Mean    Median     Utility   Range           Mean    Median
All Observations
1 week      7.91     3.60–10.09     7.00    6.89        9.52     3.709–23.972   10.56    7.39
2 weeks     4.44     2.37–6.64      4.47    4.05        5.38     1.959–12.660    5.54    3.91
3 weeks     5.10     2.64–7.41      4.99    4.52        6.85     1.633–15.574    6.64    4.75
4 weeks     4.05     2.17–6.08      4.09    3.71        4.08     0.947–9.502     3.96    2.88
5 weeks     3.04     1.57–4.39      2.96    2.68        3.53     0.777–7.826     3.26    2.37
6 weeks     1.97     0.89–2.50      1.68    1.52        3.37     0.650–6.543     2.74    1.98
High ATM Implied Volatility Observations
1 week      4.84     2.22–6.19      4.96    5.45        8.35     2.88–18.63      9.34    8.70
2 weeks     1.90     1.22–3.42      2.72    3.00        1.53     0.61–3.96       2.07    2.10
3 weeks     3.20     1.72–4.50      3.59    3.95        5.57     1.80–11.59      6.31    6.94
4 weeks     2.64     1.26–3.54      2.84    3.14        2.32     0.46–4.63       2.52    2.80
5 weeks     1.48     0.83–2.18      1.70    1.91        2.41     0.46–4.58       2.46    2.74
6 weeks     0.95     0.62–1.63      1.29    1.45        2.33     0.41–4.14       2.24    2.48
Low ATM Implied Volatility Observations
1 week     12.76     7.50–20.28    11.88   11.19       11.14     5.58–32.79     12.84   10.42
2 weeks     7.90     5.36–14.21     8.01    7.61       11.03     5.46–32.68     11.62   10.07
3 weeks     8.25     5.26–14.44     8.09    7.73        8.92     3.06–28.80      8.94    8.70
4 weeks     6.37     4.37–11.59     6.43    6.14        6.90     2.56–12.39      7.04    7.14
5 weeks     5.62     3.50–9.61      5.58    5.31        5.66     2.04–14.87      6.06    6.10
6 weeks     4.03     2.41–6.62      3.77    3.64        5.70     2.06–15.04      6.13    6.17

Table VII
Coefficient of Relative Risk Aversion Estimates from Previous Studies38

Study                                    CRRA Range
Arrow (1971)                             1
Friend and Blume (1975)                  2
Hansen and Singleton (1982, 1984)        0–1
Mehra and Prescott (1985)                55
Epstein and Zin (1991)                   0.4–1.4
Ferson and Constantinides (1991)         0–12
Cochrane and Hansen (1992)               40–50
Jorion and Giovannini (1993)             5.4–11.9
Normandin and St-Amour (1996)            <3
Ait-Sahalia and Lo (2000)39              12.7
Guo and Whitelaw (2001)                  3.52

38 This table is an updated version of Ait-Sahalia and Lo (2000), Table 5.
39 The CRRA value of 12.7 reported in Ait-Sahalia and Lo (2000) is an average value; however, they informally reject CRRA in favor of a broadly U-shaped relative risk aversion function that varies between 0 and 60.

Table VIII
Relation Between Stock Returns and Changes in Options-Implied Variance

Slope coefficients obtained by regressing the daily percentage changes in the S&P 500 index level against daily changes in the VIX implied volatility index level. The high- and low-volatility sub-samples were constructed by dividing the sample into two halves based on the level of the VIX index. The slope coefficients provide an alternative indication of changes in the representative agent's risk aversion, γ, during periods of high and low volatility:

    dP_t / P_t ≈ −γ dVar(r_t).

To examine the effects of outliers, the n largest absolute changes in volatility were trimmed from each sample. All coefficients are significant at the 1 percent level.

                                           All            High         Low
                                           Observations   Volatility   Volatility
Complete Sample                             0.40           0.38         4.41
Delete n largest volatility changes:
  n =   5                                   0.41           0.35         1.26
       10                                   0.53           0.48         0.94
       20                                   0.55           0.49         0.87
       50                                   0.52           0.47         0.76
      100                                   0.49           0.44         0.71

Table A1
Monte Carlo Tests of Various Methods of Testing Density Forecasts

Monte Carlo tests consisted of generating 10,000 samples of the indicated size (N) of U(0,1) random numbers. Tests included both uncorrelated and correlated samples. Numbers reported in the table are the frequency with which the Berkowitz, Chi-squared, Kolmogorov-Smirnov, and Kuiper statistics, computed for each Monte Carlo sample, exceeded their theoretical 90- and 99-percent values. Thus, in small samples the Chi-squared test fails to reject as frequently as it should (0.839 & 0.984 vice 0.900 & 0.990).
Tests using correlated Monte Carlo data provide a test of the robustness of these tests under conditions when the data are U(0,1) but not i.i.d., as may occur with overlapping observations.

                     Berkowitz              Chi-      Kolmogorov-
ρ            N       LR3       LR1          square    Smirnov       Kuiper
α = 0.90
0 (i.i.d.)     50    0.901     0.905        0.839     0.890         0.487
              100    0.896     0.903        0.846     0.893         0.621
              200    0.898     0.904        0.858     0.893         0.718
           10,000    0.898     0.899        0.889     0.901         0.881
0.1            50    0.918     0.915        0.850     0.896         0.503
              100    0.933     0.931        0.858     0.899         0.639
              200    0.961     0.957        0.858     0.899         0.725
           10,000    1.000     1.000        0.892     0.906         0.885
0.2            50    0.961     0.951        0.862     0.904         0.521
              100    0.985     0.984        0.867     0.903         0.657
              200    0.999     0.999        0.871     0.908         0.736
           10,000    1.000     1.000        0.896     0.920         0.894
α = 0.99
0 (i.i.d.)     50    0.990     0.989        0.984     0.987         0.744
              100    0.991     0.990        0.981     0.988         0.857
              200    0.991     0.991        0.984     0.989         0.919
           10,000    0.991     0.990        0.987     0.989         0.985
0.1            50    0.992     0.992        0.985     0.989         0.751
              100    0.993     0.994        0.984     0.987         0.864
              200    0.995     0.996        0.984     0.987         0.917
           10,000    1.000     1.000        0.988     0.989         0.984
0.2            50    0.997     0.996        0.985     0.988         0.764
              100    0.999     0.999        0.982     0.989         0.865
              200    1.000     1.000        0.984     0.990         0.927
           10,000    1.000     1.000        0.990     0.992         0.987

Descriptions of Figures

Figure 1. Distribution of estimated p-values and γs from Monte Carlo simulations using risk-neutral PDFs. Underlying data are the actual FTSE 100 and S&P 500 risk-neutral PDFs and pseudo-realizations drawn from those PDFs, implying the true γ = 0. The p-values and γs were obtained by maximizing the Berkowitz LR3 statistic over values of γ to obtain the best density forecast performance. Simulations were repeated 1,000 times for each contract/horizon. The box portion of each Tukey plot encompasses the inter-quartile range of the estimates (p-values or γs) derived from the Monte Carlo simulations. The lines dividing the box are the mean (dotted) and median (solid) respectively. The "whiskers" are the 10th and 90th percentiles of the distributions and the end points correspond to the 5th and 95th percentiles.

Figure 2.
Cross-validation results. The time-series of relative risk aversion (RRA) estimates was obtained by cross-validation using the 4-week S&P 500 and 3-week FTSE 100 options contracts. Cross-validation consists of removing a single observation (option cross-section and realization pair), estimating the RRA, replacing the omitted observation and removing the next one, and so forth. The time-index of each RRA corresponds to the observation date of the omitted observation. For each cross-validation subset, the RRA was estimated by maximizing the Berkowitz LR3 statistic over values of γ to obtain the best density forecast performance, and then converting the risk-aversion parameter γ into an RRA. The mean RRA is plotted for the exponential utility adjustment.

Figure 3. Plot of risk premia implied by risk-neutral and subjective PDFs. The risk premia are measured using the 4-week-ahead S&P 500 risk-neutral and subjective PDFs. The subjective PDFs were constructed using the relevant utility function and equation (1), and then maximizing the Berkowitz LR3 statistic over values of γ to obtain the best density forecast performance. The risk premium observed on each observation date is the difference between the means of the subjective and risk-neutral PDFs, normalized by the mean of the risk-neutral PDF. The data are plotted twice on different vertical scales for convenience.

Figure 4. Comparison of standard deviations and skewness coefficients from risk-neutral and subjective PDFs. Underlying data are 183 risk-neutral and subjective PDFs for the S&P 500 1-month options. The subjective PDFs were constructed using the relevant utility function and equation (1), and then maximizing the Berkowitz LR3 statistic over values of γ to obtain the best density forecast performance. The standard deviation and skewness of the resulting subjective PDFs are then plotted against the corresponding statistics for the unadjusted risk-neutral PDFs.
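The γ-maximization of the Berkowitz LR3 statistic referred to throughout these descriptions can be sketched in Python. This is a minimal illustration, not the authors' code: it assumes each risk-neutral PDF is supplied as density values q on a price grid, uses the power-utility adjustment in the standard form of equation (1), p(S) ∝ q(S)/U′(S) = q(S)·S^γ, and conditions the Berkowitz likelihood on the first observation. The function names are hypothetical.

```python
import numpy as np
from scipy import optimize, stats

def pit(grid, q, gamma, realization):
    """Probability-integral transform of a realization under the
    power-utility-adjusted (subjective) density p(S) proportional to
    q(S) * S**gamma (i.e., q divided by U'(S) = S**(-gamma))."""
    p = q * grid ** gamma
    steps = 0.5 * (p[1:] + p[:-1]) * np.diff(grid)   # trapezoid areas
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]                                   # normalize to [0, 1]
    return np.interp(realization, grid, cdf)

def berkowitz_lr3(u):
    """LR3 = -2[L(0,1,0) - L(mu, sigma^2, rho)] for z = Phi^{-1}(u),
    fitting a Gaussian AR(1) by maximum likelihood (conditioning on
    the first observation, a common simplification)."""
    z = stats.norm.ppf(np.clip(u, 1e-10, 1.0 - 1e-10))

    def negll(params):
        mu, sigma, rho = params
        resid = z[1:] - mu - rho * (z[:-1] - mu)
        return -np.sum(stats.norm.logpdf(resid, scale=sigma))

    fit = optimize.minimize(negll, x0=[0.0, 1.0, 0.0],
                            bounds=[(-5, 5), (1e-6, 5), (-0.99, 0.99)])
    l_null = np.sum(stats.norm.logpdf(z[1:]))        # mu=0, sigma=1, rho=0
    return -2.0 * (l_null - (-fit.fun))

def estimate_gamma(cross_sections, realizations, gamma_max=20.0):
    """Choose gamma to maximize the chi-squared(3) p-value of LR3."""
    def neg_p(gamma):
        u = np.array([pit(g, q, gamma, x)
                      for (g, q), x in zip(cross_sections, realizations)])
        return -stats.chi2.sf(berkowitz_lr3(u), df=3)
    return optimize.minimize_scalar(neg_p, bounds=(0.0, gamma_max),
                                    method="bounded").x
```

The same machinery, with the omitted-observation loop of Figure 2 wrapped around `estimate_gamma`, yields the cross-validation series described above.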
(Figures follow)

Figure 1. Distribution of estimated p-values and γs from Monte Carlo simulations using risk-neutral PDFs. [Tukey plots of p-values and γs by contract horizon, 1-wk through 6-wk, for the FTSE 100 and S&P 500, power- and exponential-utility adjusted.]

Figure 2. Cross-validation results. [Time-series of relative risk aversion estimates by expiry date for the S&P 500 and FTSE 100 options, power and exponential utility.]

Figure 3. Plot of risk premia implied by risk-neutral and subjective PDFs. [Annualized risk premium (log scale) and 1-month risk premium by PDF observation date, power- and exponential-utility functions.]

Figure 4. Comparison of standard deviations and skewness coefficients from risk-neutral and subjective PDFs. [Subjective versus risk-neutral PDF standard deviations (percent of underlying) and skewness coefficients, power- and exponential-utility-adjusted PDFs.]
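The regression behind Table VIII can be sketched as follows. This is a schematic reconstruction under stated assumptions, not the original estimation code: it assumes the VIX series is quoted as annualized percentage volatility, so that implied variance is (VIX/100)², and the function name and trimming rule are illustrative.

```python
import numpy as np

def gamma_from_variance_changes(prices, vix, trim=0):
    """Slope of daily percentage index changes on daily changes in
    options-implied variance, per dP/P = -gamma * dVar(r)."""
    ret = np.diff(prices) / prices[:-1]       # daily percentage changes
    dvar = np.diff((vix / 100.0) ** 2)        # change in implied variance
    if trim:                                  # drop the n largest |dVar|
        keep = np.argsort(np.abs(dvar))[:-trim]
        ret, dvar = ret[keep], dvar[keep]
    slope = np.polyfit(dvar, ret, 1)[0]       # OLS slope coefficient
    return -slope                             # implied gamma estimate
```

Splitting the sample by the level of the VIX before calling this function reproduces the high- and low-volatility columns of the table in outline.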