Stock Return Predictability Pockets

The document analyzes short-term stock market return predictability over time. It finds return predictability is concentrated in localized 'pockets' separated by long periods with little predictability. Using time-varying models, it identifies pockets lasting 4-24 months where predictors like the term spread significantly predict returns out-of-sample. Outside pockets, predictability is essentially undetectable despite more data. Portfolios timing based on pocket predictions generate annual alphas of 2-6% even after accounting for risks and costs.


Pockets of Predictability∗

Leland E. Farmer, University of Virginia
Lawrence Schmidt, Massachusetts Institute of Technology
Allan Timmermann, University of California, San Diego

November 23, 2021

Abstract

For many benchmark predictor variables, short-horizon return predictability in the U.S. stock
market is local in time as short periods with significant predictability (‘pockets’) are interspersed
with long periods with little or no evidence of return predictability. We document this result
empirically using a flexible time-varying parameter model which estimates predictive coefficients
as a nonparametric function of time and explore possible explanations of this finding, including
time-varying risk-premia for which we only find limited support. Conversely, pockets of return
predictability are consistent with a sticky expectations model in which investors only slowly
update their beliefs about a persistent component in the cash flow process.
Key words: Out-of-sample return predictability; time-varying expected returns; sticky
expectations; affine asset pricing models.


∗ We acknowledge constructive and insightful comments from the Editor, Stefan Nagel, an Associate Editor and
two anonymous referees. We also received helpful comments from Frank Diebold, Xavier Gabaix, Bradley Paye,
Hashem Pesaran and seminar participants at Penn, Boston University, NC State University, UCSD, University of
Warwick, University of Virginia, University of British Columbia, Edhec, the 2018 SoFiE meetings in Lugano, the
2018 SITE meetings at Stanford, and the 2018 IAAE conference in Montreal. We thank Yury Olshanskiy and Victor
Sellemi for outstanding research assistance.
1 Introduction
Researchers have long been interested in the extent to which stock returns are predictable. Over
the last several decades, time-varying risk premia have widely been suggested as a key source of
fluctuations in stock prices, and many workhorse macro-finance models seek to endogenously generate
large fluctuations in discount rates on the aggregate stock market. Both welfare calculations and
normative predictions about optimal investment strategies are often quite different in the presence of
return predictability. At the same time, these findings have been met with some skepticism given a
number of studies which find empirical evidence that return predictability is highly unstable, varying
greatly across time and across different markets and being difficult to exploit out-of-sample.1
Existing evidence on return predictability has mostly been established using linear, constant-
coefficient regressions which pool information across long historical spans of time and so are designed
to establish whether stock returns are predictable “on average,” i.e., across potentially very different
economic states. Inference on the resulting coefficients may yield misleading and unstable results
if, in fact, return predictability shifts over time. To address this possibility, our paper adopts a new
estimation strategy capable of identifying patterns in return predictability that are “local” in time.
Specifically, we estimate predictive regressions with time-varying parameters based on one-sided
kernel regressions that allow the coefficients to follow a smooth, nonparametric function of calendar
time. Unlike alternative approaches which impose tight parametric restrictions on how predictive
coefficients evolve over time, we do not need to take a stand on the return generating process.2
Next, we use a local trend estimation approach to identify periods in time where forecasts from the
local kernel regressions were more accurate than those from a prevailing mean benchmark model.
Following studies such as Pesaran and Timmermann (1995) and Welch and Goyal (2008) which
emphasize the need for out-of-sample return predictability, our approach is fully out-of-sample,
avoiding the use of any future data, and we let the data determine both how large predictability is
at a given point in time and how long it lasts.
Using this approach, we present new empirical evidence that short-horizon return predictability
is quite concentrated, or local in time, and tends to fall in certain (contiguous) “pockets.” For
example, using the term spread as a predictor variable over a sixty-three-year period, our approach
identifies in real time seven pockets whose duration lasts between four months and two years so
that, in total, fifteen percent of the sample is spent inside pockets with return predictability. As
another illustration of the extent to which short-horizon predictability is concentrated in time,
we estimate constant-coefficient univariate regression models for our predictors over two
subsamples: those observations associated with our ex-ante identified “pockets” and all other
periods. We find strong evidence of in-pocket return predictability and essentially no statistically
significant evidence of predictability outside of pockets, despite the fact that the vast majority of
our sample falls outside of these pocket periods, i.e., when we would have more statistical power
to detect predictability.

1 For early studies, see, e.g., Campbell (1987), Fama and French (1988, 1989), Keim and Stambaugh (1986), and
Pesaran and Timmermann (1995). Lettau and Ludvigson (2010) and Rapach and Zhou (2013) review the extensive
literature on return predictability. Paye and Timmermann (2006), Rapach and Wohar (2006), and Chen and Hong
(2012) find evidence of parameter instability for stock market return prediction models.
2 Several studies adopt parametric assumptions about time variation in the return generating process. For example,
Henkel, Martin and Nardari (2011) use regime switching models to capture changes in stock return predictability,
while Dangl and Halling (2012) and Johannes, Korteweg and Polson (2014) use time-varying parameter models to
track predictability in stock returns. Like any other non-parametric approach, we do have to pick a bandwidth
parameter, but our findings are robust to choices of this parameter across a wide range of values.

To quantify the amount of local return predictability, and to calibrate what amount of pre-
dictability to expect under conventional asset pricing models, we compute Clark and West (2007)
statistics that compare out-of-sample mean squared prediction errors from the local kernel regres-
sions to those from a prevailing mean benchmark. Next, we conduct a battery of simulation exercises
that assess the extent to which we can statistically reject the null hypotheses of no predictability
or predictability associated with a constant coefficient model, respectively. Mirroring the above
analysis, we conduct these tests for the full sample as well as for ex-ante identified in-pocket and
out-of-pocket subperiods. For the full sample, i.e., “on average”, we find no statistical evidence that
our local kernel regressions outperform the prevailing mean across the univariate or multivariate
models that we consider. Results deteriorate substantially outside of pockets; the time-varying co-
efficient models always underperform prevailing mean forecasts, sometimes by a significant margin.
These findings echo a number of empirical results from the literature (e.g. Welch and Goyal, 2008)
indicating the difficulty of detecting out-of-sample return predictability, a phenomenon which is
exacerbated in our context given that our local regressions are subject to larger estimation error
relative to standard approaches.
However, the picture changes substantially inside ex-ante identified pockets, inside which we
find strong evidence of return predictability across a range of univariate and multivariate models.
Consistent with findings in the literature, these results generally improve further if we impose
economically-motivated restrictions on our expected return forecasts or incorporate multivariate
information, e.g., by combining forecasts from univariate models.3
To quantify the economic value of our ability to detect significant out-of-sample return pre-
dictability, we construct managed portfolios which use our ex-ante expected excess return forecasts
to dynamically rebalance a portfolio comprising the market and a risk-free asset. While such a
strategy earns conditional CAPM alphas of zero by construction, it generates sizable unconditional
CAPM alphas; for example, our two best forecast combination-based strategies deliver annual-
ized CAPM alphas (t-statistics) of 6.4% (6.1) and 6.1% (5.7), respectively, while univariate return
prediction models generate alphas in the range of 2-4% with highly significant t-statistics. These
results are robust to controlling for volatility and momentum factors and hold net of proportional
transaction costs as high as 10 basis points.
In an additional sequence of tests, we repeat these analyses with the Fama-French SMB and
HML factors. In both cases, we find similar, and often even stronger, results. Whereas all
predictors underperform out of pockets, we detect substantial statistical evidence for out-of-sample
predictability inside of pockets. Likewise, our market timing exercises deliver substantial and
economically meaningful gains in risk-adjusted performance.

3 See, e.g., Campbell and Thompson (2008), Kelly and Pruitt (2013), Pettenuzzo, Timmermann and Valkanov
(2014), Rapach, Strauss and Zhou (2010), and Timmermann (2006).

We conduct a battery of additional tests to ensure the robustness of our main results. In
particular, we vary the length of the windows used to estimate the parameters of the local kernel
regressions and identify pockets, separately consider null hypotheses with zero or constant slope
coefficients on the state variables, examine an alternative “local prevailing mean” benchmark
that accounts for possible return momentum, and examine the effect of Stambaugh (1999) bias. In
all cases, we corroborate that our empirical findings are not sensitive to the setup of our baseline
analysis. Moreover, to make our findings more directly comparable to the extant literature, we
apply our real-time, local predictability approach to monthly stock returns. Again, we find that
our local out-of-sample return predictions are significantly more accurate than the prevailing mean
benchmark inside ex-ante identified pockets while the reverse holds outside pockets and we show
that our approach can lead to economically important improvements over existing methods from
the return predictability literature.4
With these new empirical results in hand, we next explore which economic mechanisms are
capable of generating pockets of local return predictability. We start by conducting return sim-
ulations from four workhorse rational expectations asset pricing models which represent a wide
range of mechanisms and are representative of the dynamics of returns and state variables implied
by models exhibiting time-varying risk premia. These include the long-run risk model of Bansal
and Yaron (2004), the habit formation model of Campbell and Cochrane (1999), the heterogeneous
agent model of Gârleanu and Panageas (2015), and the rare disaster model of Wachter (2013).
All of these models are calibrated to generate dynamics that are consistent with the data, in the
sense that increases in risk premia tend to correspond with slow-moving changes in discount rates.
Accordingly, the state variables governing return predictability are highly persistent, signal-to-
noise ratios for predictive regressions are extremely low, and innovations to predictors such as the
dividend-price ratio have very strong negative correlations with realized returns. As such, posi-
tive shocks to the discount rate, especially ones large enough to be detectable, will generate large
negative realized returns which at least temporarily lead to exactly the wrong inference about the
predictive relationship (Stambaugh, 1999). This makes it challenging to detect state-dependent
return predictability in such models.
Consistent with this intuition, we find that none of these workhorse models is capable of
matching the empirically observed out-of-sample predictive accuracy associated with in-pocket
periods.5 Turning to the economic performance (market-timing) results, the average alpha estimates
are usually close to zero and statistically insignificant. Both results indicate that these benchmark
asset pricing models fail to generate short-lived pockets of substantial predictability that are
detectable via our local kernel regressions. This suggests that the very features which allow the asset
pricing models to replicate a number of stylized facts about equity returns in the data combine
to create substantial potential for estimation error to dominate the small amount of true ex-ante
predictability generated by the time-varying risk premia in the model.

4 Henkel, Martin and Nardari (2011), Dangl and Halling (2012), and Rapach, Strauss and Zhou (2010) argue that
return predictability is largely confined to recession periods. In unreported results, we find that the link between
economic recessions and our return predictability pockets is rather weak and that the stage of the economic cycle
only explains a very small part of the time variation in expected returns that we document. Movements in an
investor sentiment indicator (Baker and Wurgler, 2006, 2007) or changes in broker-dealer leverage (Adrian, Etula
and Muir, 2014) tracking availability of arbitrage capital, also do not correlate strongly with the time variation in
return predictability that we document.

Motivated by a recent and rapidly growing literature at the intersection of macroeconomics and
finance (see, e.g. Coibion and Gorodnichenko, 2015; Bouchaud et al., 2019), we finally consider an
alternative explanation for our observed results. Specifically, we consider the potential implications
for high-frequency return predictability of a model in which agents have sticky expectations, under-
reacting to news in a manner consistent with both theoretical work and a large body of empirical
evidence.6 We propose a stylized asset pricing model in which agents price cash flows according to
a loglinearized dynamic dividend discount model in which prices equal the sum of expected cash
flows discounted by time-varying subjective discount rates. However, we deviate from the rational
expectations benchmark by assuming that agents’ beliefs about future cash flows adjust sluggishly
to new information relative to the true data generating process. In other words, whereas agents
believe that expected excess returns are governed by a set of slow moving state variables similar
to the workhorse models discussed above, expected returns feature an additional, high-frequency
component under the objective probability distribution. This extra term captures the difference
between agents’ subjective forecasts of expected cash flow growth rates and the true state variable
governing expected cash flow growth rates. The presence of this term implies that prices exhibit
“local factor momentum”: recent changes in valuation ratios signal the likelihood that future val-
uations will continue to drift upwards, a pattern which is counter to the long-run mean reversion
in prices which is expected from time-varying discount rates.
We calibrate our model to match a number of observable asset pricing moments, then perform
a number of simulation exercises to assess the extent to which such a model generates pockets of
predictability. Importantly, the degree of stickiness of beliefs is disciplined by external estimates
based on analysts’ forecasts of macroeconomic quantities from Coibion and Gorodnichenko (2015).
We then compare simulations from our sticky expectations benchmark with analogous data
simulated from a rational expectations model with the same cash flow and subjective discount rate
dynamics. Once again, our local kernel regressions are unable to detect statistically or economically
significant out-of-sample return predictability in the specifications which impose rational
expectations. However, despite the fact that local predictability is not targeted, we find that the sticky
expectations model can replicate the degree of out-of-sample return predictability observed in the
data, a pattern that is robust across predictors and econometric specifications.

5 Matching the full-sample or out-of-pocket results is less challenging, in part because the out-of-sample accuracy
of our predictive return regressions is fairly weak overall.
6 Early theoretical papers on sluggish adjustments in expectations include Mankiw and Reis (2002), Woodford
(2003), and Sims (2003). A number of empirical papers present evidence on underreaction to aggregate news at short
horizons. See, e.g., Moskowitz and Grinblatt (1999), Hong, Lim and Stein (2000), Hong, Torous and Valkanov
(2007), Hou (2007), and Bouchaud et al. (2019), who present evidence of slow diffusion of stock- or industry-specific
information in stock markets. Katz, Lustig and Nielsen (2017) also find evidence of underreaction of asset prices to
fluctuations in inflation rates across countries. Turning to fixed income markets, d’Arienzo (2020) and Wang (2020)
both present evidence that yields underreact to macro news at short horizons, but overreact at longer horizons, which
relates to a puzzle identified by Giglio and Kelly (2018). See also Bordalo et al. (2020) and Angeletos, Huo and
Sastry (2021) for additional empirical evidence and discussion of the related empirical and theoretical literature on
this subject.

In our sticky expectations model, one source of return predictability is the “belief discrepancy”
between agents’ cash flow expectations versus the “correct” forecasts conditional on the true data
generating process. The presence of such a belief distortion acts as an important additional chan-
nel through which expected returns are forecastable by the econometrician in such models.7 We
conclude by providing direct evidence linking our expected return forecasts with data on forecast
errors of professional forecasters. Consistent with predictions of the theory, above-average forecasts
from all of our time-varying coefficient models predict positive forecast errors in the future. In other
words, sluggish updating of agents’ beliefs implies that returns are predictable because future cash
flow “shocks”–deviations between realizations and agents’ subjective expectations–are forecastable.
Our local return forecasts capture a nontrivial fraction of this variation.8
The rest of the paper proceeds as follows. Section 2 discusses conventional approaches to mod-
eling return predictability and introduces our nonparametric methodology for identifying pockets
with local return predictability. Section 3 introduces our daily data and presents empirical evidence
on return predictability pockets. This section also uses simulations to address whether the pockets
could be generated spuriously as a result of the repeated use of correlated tests for local return
predictability. Section 4 evaluates the statistical and economic performance of our nonparametric
return forecasts and conducts a number of robustness checks. Section 5 considers whether a suite
of workhorse asset pricing models with time-varying risk premia are capable of generating return
predictability pockets. Section 6 presents our framework with sticky expectations, illustrates that
a calibrated model can match a number of empirical results, then presents empirical evidence link-
ing our ex-ante expected return forecasts with future macroeconomic forecast errors. Section 7
concludes. A set of appendices contains additional technical material and empirical results.

7 The effect of such a wedge on local return predictability depends on the sequence of recent shocks to the cash
flow, risk premium and risk-free rate processes in the model. Because the sequence of shocks is never exactly the
same as has occurred previously and expectations are sticky, pockets of return predictability will never be “learned
away” by agents. This is in contrast to papers such as Green, Hand and Soliman (2011) and McLean and Pontiff
(2016) which imply that patterns of return predictability that can be exploited for economic gains will vanish once
discovered by agents. See also Schwert (2003) and Timmermann (2008).
8 See also Bouchaud et al. (2019) and Gomez Cram (2021) for related evidence using forecast errors aggregated
from equity analysts’ earnings forecasts. Gomez Cram (2021) introduces a sticky expectations model that relates
return predictability to turning points of the business cycle. The mechanism of his model and his empirical
results are quite different from ours since we rely on nonparametric methods and find only a weak association between
business cycle variation and pockets with local return predictability.

2 Prediction Models and Estimation Methodology
This section briefly discusses the conventional constant-coefficient return prediction model before
introducing the non-parametric regression methodology that we use to identify time variation in
return predictability.

2.1 Conventional Return Predictability Model


A large empirical literature summarized in Welch and Goyal (2008) and Rapach and Zhou (2013)
studies predictability of stock returns using linear, constant-coefficient models of the form

$$r_{s,t+1} - r_{f,t+1} = x_t'\beta + \varepsilon_{t+1}. \tag{1}$$

Here $r_{s,t+1}$ is the stock market return and $r_{f,t+1}$ is the risk-free rate, both measured in period $t+1$, so
that $r_{t+1} \equiv r_{s,t+1} - r_{f,t+1}$ measures the excess return. $x_t$ is a $(d \times 1)$ vector of covariates (predictors)
which could include a constant, and $\varepsilon_{t+1}$ is an unobservable disturbance with $E[\varepsilon_{t+1} \mid x_t] = 0$.
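
As a rough illustration of equation (1), the constant-coefficient model can be fit by ordinary least squares. The sketch below is not the authors' code: the function name `fit_constant_coefficient`, the simulated data, and the slope value of 0.05 are assumptions chosen only to show the timing convention (predictors dated $t$, excess returns dated $t+1$).

```python
import numpy as np

def fit_constant_coefficient(r_next, x):
    """OLS estimate of beta in r_{t+1} = x_t' beta + eps_{t+1}.

    r_next : (T,) excess returns realized in period t+1
    x      : (T, d) predictors observed in period t
             (add a column of ones if an intercept is wanted)
    """
    beta, *_ = np.linalg.lstsq(x, r_next, rcond=None)
    return beta

# Hypothetical simulated data, for illustration only.
rng = np.random.default_rng(0)
T = 5000
predictor = rng.standard_normal(T)
x = np.column_stack([np.ones(T), predictor])   # constant plus one predictor
true_beta = np.array([0.0, 0.05])              # assumed values
r_next = x @ true_beta + rng.standard_normal(T)

beta_hat = fit_constant_coefficient(r_next, x)
```

Because a single coefficient vector is fit to the whole sample, any predictability that exists only in short stretches is averaged with long stretches where the slope is zero.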
In Appendix A, we show that the specification in (1) is consistent with a broad class of affine
asset pricing models exhibiting time variation in either the quantity or the price of risk. For exam-
ple, (1) holds approximately in a representative agent model where agents have Epstein and Zin
(1989) preferences when aggregate consumption growth is an affine function of state variables that
follow a stationary vector autoregressive process.9 This setting includes many of the specifications
considered in the literature on consumption-based asset pricing models with long-run risks and
rare disasters and also holds under incomplete markets with state-dependent higher moments of
uninsurable idiosyncratic shocks.10 As we further demonstrate in Appendix A, subject to certain
restrictions, (1) can also allow for time-variation in the price of risk and, thus, nests many models
which have been used to characterize the term structure of interest rates as well as the log-linearized
stochastic discount factor habit formation model of Campbell and Cochrane (1999).
Despite its theoretical appeal, the empirical validity of the assumption of constant regression
coefficients in the linear return regression (1) has been challenged in studies such as Paye and
Timmermann (2006), Rapach and Wohar (2006), Chen and Hong (2012), Dangl and Halling (2012),
and Johannes, Korteweg and Polson (2014), all of which find strong evidence that this assumption
is empirically rejected for U.S. stock returns using standard predictor variables. We therefore next
consider an econometric framework that can accommodate unstable coefficients.
9 See, e.g., Bansal and Yaron (2004), Hansen, Heaton and Li (2008), Eraker and Shaliastovich (2008) and Drechsler
and Yaron (2011).
10 See, e.g., Constantinides and Duffie (1996), Constantinides and Ghosh (2017), Schmidt (2016), and Herskovic
et al. (2016).

2.2 A Flexible Time-Varying Parameter Model
We generalize (1) to allow for time-varying return predictability of the form:

$$r_{t+1} = x_t'\beta_t + \varepsilon_{t+1}, \tag{2}$$

where the regression coefficients $\beta_t$ are now subscripted with $t$ to indicate that they are functions
of time as a means of allowing for time-varying return predictability. We also allow for general
forms of conditional heteroskedasticity, $\sigma_t^2 \equiv E[\varepsilon_t^2 \mid x_t] = \sigma^2(x_t)$. To economize on notation, we let
$r_{t+1}$ denote the log excess market return minus its sample mean and assume that the predictor
variables $x_t$ are de-meaned prior to running the regression.
To identify periods with return predictability, we follow the nonparametric estimation strategy
developed in Robinson (1989) and Cai (2007) which is valid regardless of whether the linear return
prediction model in (1) is correctly specified. Using nonparametric methods for pocket identification
offers the major advantage that we do not need to take a stand on the dynamics of local return
predictability, e.g., whether such predictability is short-lived or long-lived and whether it disappears
slowly or rapidly. Instead, our nonparametric methods allow us to characterize the “shape” of the
pockets, e.g., the duration and frequency of pockets and the amount of return predictability inside
the pockets which can provide important clues about the economic sources of return predictability.11
The nonparametric approach views $\beta : [0, 1] \to \mathbb{R}^d$ as a smooth function of time that can have
at most finitely many discontinuities. The problem of estimating $\beta_t$ for $t = 1, \ldots, T$ can then be
thought of as estimating the function $\beta$ at finitely many points, $\beta_t = \beta(t/T)$.12


While Appendix B provides additional details, our basic approach for the nonparametric analysis
is as follows. We use a local constant model to compute the estimator of $\beta_t$ as

$$\hat{\beta}_t = \arg\min_{\beta_0 \in \mathbb{R}^d} \sum_{s=1}^{T} K_{hT}(s - t)\left[r_{s+1} - x_s'\beta_0\right]^2. \tag{3}$$

The weights on the local observations are controlled through the kernel $K_{hT}(u) \equiv K\left(u/(hT)\right)/(hT)$,
where $h$ is the bandwidth. The estimator in (3) can be viewed as a series of weighted least squares
regressions with Taylor expansions of $\beta$ around each point $t/T$. The weighting of observations in
(3) can be contrasted with the familiar rolling window estimator which uses a flat kernel that
puts equal weights on observations in a certain neighborhood. For this estimator $K_{hT}(s - t) = 1$
if $s \in [t - \lfloor hT \rfloor, t + \lfloor hT \rfloor]$, otherwise $K_{hT}(s - t) = 0$. The conventional rolling window approach
can be a fairly inefficient way of picking up time variation in $\beta$ if the build-up and disappearance
of such patterns is more gradual (i.e., $\beta_t$ is smooth), as we might expect a priori. Our preferred
estimator differs from it by allowing $K_{hT}(\cdot)$ to be a smooth function which decreases as $s$ moves
further away from $t$.13

11 Although nonparametric kernel regression is not widely used in finance, papers such as Ang and Kristensen (2012)
have used this approach to estimate and test conditional CAPM alphas and betas.
12 Because time, $t$, is normalized by the number of observations $T$, $\beta$ is a function whose domain is $[0, 1]$ as opposed
to $[0, T]$. This is useful because we need more and more local information to consistently estimate $\beta_t$ as $T \to \infty$.

To test if local predictability could have been identified in real time, we estimate our model
using a one-sided analog of the Epanechnikov Kernel:

$$K(u) = \frac{3}{2}\left(1 - u^2\right)\mathbf{1}\{-1 < u < 0\}, \tag{4}$$

ensuring that only past data are used to capture local return predictability. Our baseline results
use a 2.5-year one-sided bandwidth, chosen as half the length of a two-sided five-year kernel which
is a standard choice of rolling window in many finance applications.
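
The one-sided kernel in (4) and the local estimator in (3) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `one_sided_epanechnikov` and `local_beta`, the bandwidth of 500 observations, and the simulated break in the slope are all hypothetical. The bandwidth argument is measured in observations (the quantity $hT$ in the text).

```python
import numpy as np

def one_sided_epanechnikov(u):
    """K(u) = 1.5 * (1 - u^2) for -1 < u < 0 and zero elsewhere, as in (4)."""
    u = np.asarray(u, dtype=float)
    return np.where((u > -1.0) & (u < 0.0), 1.5 * (1.0 - u**2), 0.0)

def local_beta(r_next, x, t, bandwidth):
    """Kernel-weighted least squares estimate of beta_t, as in (3).

    r_next[s] is the return realized one period after x[s] is observed.
    Only observations with s < t receive positive weight, so every return
    used has been realized by time t.
    """
    s = np.arange(len(r_next))
    w = one_sided_epanechnikov((s - t) / bandwidth) / bandwidth
    sw = np.sqrt(w)
    # Weighted least squares via ordinary least squares on rescaled data.
    beta, *_ = np.linalg.lstsq(x * sw[:, None], r_next * sw, rcond=None)
    return beta

# Hypothetical example: the slope is 0 early in the sample and 0.5 later on.
rng = np.random.default_rng(1)
T, bandwidth = 2000, 500
x = rng.standard_normal((T, 1))
beta_path = np.where(np.arange(T) < 1000, 0.0, 0.5)
r_next = x[:, 0] * beta_path + rng.standard_normal(T)

beta_early = local_beta(r_next, x, t=1000, bandwidth=bandwidth)[0]
beta_late = local_beta(r_next, x, t=1999, bandwidth=bandwidth)[0]
```

Replacing the kernel with an indicator for the same window would give the flat-weight rolling estimator; the smooth kernel instead down-weights the oldest observations in the window.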
As a measure of relative predictive accuracy, define the squared error difference (SED) between
some benchmark forecast, $\bar{r}_{t|t-1}$, and the forecast from the local regression model, $\hat{r}_{t|t-1}$:

$$\mathrm{SED}_t = (r_t - \bar{r}_{t|t-1})^2 - (r_t - \hat{r}_{t|t-1})^2. \tag{5}$$

Periods in which $\mathrm{SED}_t > 0$ signify that the kernel regression produced a more accurate forecast
(in a squared error sense) than the benchmark since it incurred a smaller (squared) forecast error.
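
A small numerical sketch of equation (5) may help fix the sign convention. The benchmark here is taken to be an expanding-window prevailing mean, which is an assumption about how the benchmark is computed; the function names and the four data points are hypothetical.

```python
import numpy as np

def prevailing_mean_forecasts(r):
    """Forecast of r[t] equal to the mean of r[0], ..., r[t-1]; NaN at t = 0."""
    r = np.asarray(r, dtype=float)
    out = np.full(len(r), np.nan)
    out[1:] = np.cumsum(r)[:-1] / np.arange(1, len(r))
    return out

def squared_error_difference(r, benchmark_fcst, model_fcst):
    """SED_t = (r_t - benchmark)^2 - (r_t - model)^2; positive when the model wins."""
    r = np.asarray(r, dtype=float)
    return (r - benchmark_fcst) ** 2 - (r - model_fcst) ** 2

r = np.array([0.01, -0.02, 0.03, 0.00])            # hypothetical returns
bench = prevailing_mean_forecasts(r)
model = np.array([np.nan, -0.01, 0.02, 0.01])      # hypothetical local-model forecasts
sed = squared_error_difference(r, bench, model)
# sed[1] and sed[2] are positive (model more accurate); sed[3] is negative.
```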
To help identify such periods, we project $\mathrm{SED}_t$ on a constant and a time trend

$$\mathrm{SED}_t = \gamma_{0,t} + \gamma_{1,t} t + v_t. \tag{6}$$

We estimate $\gamma_{0,t}$ and $\gamma_{1,t}$ using again a one-sided Epanechnikov kernel. We then define predictability
pockets as periods for which $\widehat{\mathrm{SED}}_t = \hat{\gamma}_{0,t} + \hat{\gamma}_{1,t} t > 0$. At the onset of a pocket we would expect
that $\hat{\gamma}_{1,t} > 0$, indicating that recent values of the benchmark model’s squared forecast errors are
beginning to exceed those from the local kernel model. Conversely, after the SED measure has
peaked, we would expect $\hat{\gamma}_{1,t} < 0$, indicating waning return predictability.14
Our estimates of $\gamma_{0,t}$ and $\gamma_{1,t}$ use a shorter one-year bandwidth because the pocket detection
regression in equation (6) includes a time trend as a predictor. A priori we would expect such
a trend to be very local and not last too long since this would imply an unreasonable buildup in
return predictability. Using a shorter window to estimate the $\gamma$ coefficients will, of course, produce
larger estimation errors, but this is not so important here because of our use of a robust pocket
identification scheme based on the sign of $\widehat{\mathrm{SED}}_t$.
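
The pocket rule based on the sign of the fitted SED can be sketched as follows. This is an illustrative reading of equation (6), not the authors' code: the function names, the bandwidth of 50 observations, and the stylized SED series are assumptions. The time trend is centred at $t$, so the estimated intercept equals the fitted value $\hat{\gamma}_{0,t} + \hat{\gamma}_{1,t} t$; whether the observation dated $t$ itself receives weight is a detail the sketch resolves by using the strictly one-sided kernel.

```python
import numpy as np

def one_sided_epanechnikov(u):
    """One-sided Epanechnikov kernel: positive only for -1 < u < 0."""
    u = np.asarray(u, dtype=float)
    return np.where((u > -1.0) & (u < 0.0), 1.5 * (1.0 - u**2), 0.0)

def fitted_sed(sed, t, bandwidth):
    """Kernel-weighted regression of SED_s on (1, s - t), evaluated at s = t."""
    sed = np.asarray(sed, dtype=float)
    s = np.arange(len(sed), dtype=float)
    w = one_sided_epanechnikov((s - t) / bandwidth)
    keep = (w > 0) & ~np.isnan(sed)
    if keep.sum() < 2:
        return np.nan                      # too little history to fit a trend
    design = np.column_stack([np.ones(keep.sum()), s[keep] - t])
    sw = np.sqrt(w[keep])
    coef, *_ = np.linalg.lstsq(design * sw[:, None], sed[keep] * sw, rcond=None)
    return coef[0]                         # intercept = fitted value at time t

def pocket_indicator(sed, bandwidth):
    """True where the fitted SED is positive; NaN fits count as no pocket."""
    fitted = np.array([fitted_sed(sed, t, bandwidth) for t in range(len(sed))])
    return fitted > 0

# Stylized SED path: negative, then positive for 100 periods, then negative again.
sed = np.concatenate([-np.ones(300), np.ones(100), -np.ones(100)])
pockets = pocket_indicator(sed, bandwidth=50)
```

Because only the sign of the fitted value is used, no standard errors or significance levels enter the rule.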
Intuitively, combining the time trend in (6) with our local kernel weighting scheme allows us
to identify temporary, possibly short-lived, patterns in return predictability, lending our pocket
definition a number of advantages. First, a pocket is triggered if the local return prediction model
is deemed more accurate than the benchmark in the sense that it produces a lower expected squared
forecast error. The definition therefore explicitly accounts for estimation uncertainty: Even if the
true current value of $\beta_t$ in (2) is high, this may not produce a pocket if $\beta_t$ cannot be estimated
sufficiently accurately, e.g., because returns have been very volatile (heteroskedasticity) or because
$\beta_t$ has not been high for long enough to allow our local estimation scheme in (3) to pick this up.

13 A rolling window estimator loses some efficiency by not using any information from outside the fitting window
and also by assigning the same weight to all observations inside the window. Usually, it is more efficient to give
a lower weight to observations far away from $t$ relative to observations extremely close to $t$, because the latter are
presumably more representative than the former, and thus present a more favorable bias/variance tradeoff.
14 Indeed, this is a consistent pattern that we observe across all predictors in our empirical analysis.

Second, our definition builds on the practice started by Welch and Goyal (2008) of studying
how return predictability evolves over time through sums of squared forecast error differences. Dif-
ferences in squared forecast errors are also the basis for formal comparisons of economic forecasting
performance in the tests of Diebold and Mariano (1995) and Clark and West (2007). These tests do
not include a local time trend, however. A novelty of our approach is that it allows us to identify
temporary return predictability through a local estimate of the trend in the relative accuracy of
the return forecasts.
Third, our pocket definition does not require us to compute standard errors for the estimates
γ̂ 0,t , γ̂ 1,t since we do not conduct formal hypothesis tests to identify pockets and hence do not
have to decide on a significance level. This is particularly important for out-of-sample estimation
since one-sided local kernel estimates of standard errors can be imprecise. Our definition also does
not impose any minimum requirements on the length of the pockets. In practice, this means that
short-lived pockets will sometimes be triggered (“false alarms”). One could easily impose that a
pocket is triggered only after a certain number of periods for which $\widehat{SED}_t > 0$. Such a rule would
come at the cost of delaying pocket identification, however, so we do not further pursue this idea.
On a final note, all of our estimates are computed recursively, out-of-sample, using only real-time
information available prior to the period whose returns are being predicted. Specifically, we obtain
the estimates γ̂ 0,t and γ̂ 1,t in equation (6) from a one-sided kernel using only information known at
time t. We then define predictability pockets as periods (days), t, for which $\widehat{SED}_t = \hat{\gamma}_{0,t} + \hat{\gamma}_{1,t}\, t > 0$.
If, on day t, $\widehat{SED}_t > 0$, then we use the forecasts of returns in period t + 1 from the local kernel
regression, $\hat{r}_{t+1|t} = x_t' \hat{\beta}_t$, where $\hat{\beta}_t$ again uses only information known at time t.
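The real-time pocket rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the Epanechnikov-style one-sided kernel, the `bandwidth` argument, and the function names are hypothetical choices, not the paper's exact specification. The sketch fits a kernel-weighted regression of the squared error differential on a constant and a time trend using data up to day t only, and flags a pocket when the fitted value is positive.

```python
import numpy as np

def one_sided_weights(t, bandwidth):
    """Weights on s = 0..t that decline with distance from t (assumed kernel shape)."""
    u = (t - np.arange(t + 1)) / bandwidth
    return np.where(u < 1.0, 0.75 * (1.0 - u**2), 0.0)

def fitted_sed(sed, bandwidth):
    """Fitted value gamma0_t + gamma1_t * t from a local trend fit on data up to t."""
    sed = np.asarray(sed, dtype=float)
    fitted = np.full(sed.shape, np.nan)
    for t in range(1, len(sed)):
        w = one_sided_weights(t, bandwidth)
        keep = w > 0
        if keep.sum() < 2:          # need two points to fit a constant and a trend
            continue
        s = np.arange(t + 1)[keep].astype(float)
        X = np.column_stack([np.ones_like(s), s])
        sw = np.sqrt(w[keep])       # weighted least squares via rescaling
        gamma, *_ = np.linalg.lstsq(X * sw[:, None], sed[: t + 1][keep] * sw, rcond=None)
        fitted[t] = gamma[0] + gamma[1] * t
    return fitted

def pocket_indicator(sed, bandwidth):
    """True on days where the fitted squared error differential is positive."""
    f = fitted_sed(sed, bandwidth)
    return np.isfinite(f) & (np.nan_to_num(f) > 0)
```

Because every fit stops at day t, the indicator on day t could be used to decide which forecast to apply to day t + 1.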

2.3 Measures of Pocket Characteristics


To help understand pockets of predictability, we measure their characteristics in a variety of ways.
First, we want to know how many contiguous pockets, Np , our procedure detects along with how
long the pockets last. To this end, let $I_{jt} = 1$ for time-series observations inside the jth pocket,
while $I_{jt} = 0$ outside this pocket for $t = 1, \ldots, T$. Denoting by $t_{0j}$ and $t_{1j}$ the start and end date of
the jth pocket, the duration of pocket j, $Dur_j$, is given by

$$Dur_j = \sum_{\tau=1}^{T} I_{j\tau} = t_{1j} - t_{0j} + 1, \qquad j = 1, \ldots, N_p. \tag{7}$$

Long-lived pockets should, all else equal, be easier for investors to detect and exploit.
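As an illustration of the duration measure, the following sketch extracts contiguous pockets from a daily 0/1 pocket indicator and reports their start dates, end dates, and durations. The function names are hypothetical and positions are zero-based array indices rather than calendar dates.

```python
import numpy as np

def pocket_spans(indicator):
    """Return inclusive (t0, t1) index pairs for each contiguous run of pocket days."""
    flags = np.asarray(indicator, dtype=bool).astype(int)
    edges = np.diff(np.concatenate(([0], flags, [0])))
    starts = np.flatnonzero(edges == 1)        # 0 -> 1 transitions
    ends = np.flatnonzero(edges == -1) - 1     # 1 -> 0 transitions, shifted to last pocket day
    return list(zip(starts.tolist(), ends.tolist()))

def pocket_durations(indicator):
    """Duration of each pocket, t1 - t0 + 1."""
    return [t1 - t0 + 1 for t0, t1 in pocket_spans(indicator)]
```

The number of pockets is then simply the length of the list returned by `pocket_spans`.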

Pocket durations do not capture the total amount of predictability which also depends on the
magnitude of the local predictability. We quantify this through the local $R^2$ at time t, $R_t^2$:15

$$R_t^2 = 1 - \frac{\sum_{s=1}^{T} K_{hT}(s-t)\,(r_s - \hat{r}_{t|t-1})^2}{\sum_{s=1}^{T} K_{hT}(s-t)\,(r_s - \bar{r}_{s|s-1})^2}. \tag{8}$$

We measure the total amount of return predictability inside a pocket by means of the integral
$R^2$ measure (IR) which, for the jth pocket, is defined as

$$IR_j^2 = \sum_{\tau=t_{0j}}^{t_{1j}} R_\tau^2 = \sum_{\tau=1}^{T} I_{j\tau} R_\tau^2. \tag{9}$$

Visually, this measure captures the area under a time-series plot of the local Rt2 values in (8),
summed across the pocket indicators. By combining the duration of a pocket with the magnitude of
the predictability inside the pocket, the IR2 measure provides insights into how much predictability
is present as well as how feasible it is for investors to detect and exploit such predictability.
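A rough sketch of how the local and integrated measures might be computed is given below. It assumes a two-sided Epanechnikov-style kernel with a hypothetical `bandwidth` argument, and it pairs each return with the forecast made for that same period; both are illustrative assumptions rather than the paper's exact choices.

```python
import numpy as np

def local_r2(returns, model_fc, mean_fc, bandwidth):
    """Kernel-weighted R^2 around each date t, relative to the benchmark forecasts."""
    r, f, m = (np.asarray(a, dtype=float) for a in (returns, model_fc, mean_fc))
    idx = np.arange(len(r))
    out = np.empty(len(r))
    for t in idx:
        u = (idx - t) / bandwidth
        w = np.where(np.abs(u) < 1.0, 0.75 * (1.0 - u**2), 0.0)  # assumed kernel
        out[t] = 1.0 - np.sum(w * (r - f) ** 2) / np.sum(w * (r - m) ** 2)
    return out

def integrated_r2(r2, spans):
    """Sum the local R^2 values over each pocket, given inclusive (t0, t1) spans."""
    r2 = np.asarray(r2, dtype=float)
    return [float(np.sum(r2[t0 : t1 + 1])) for t0, t1 in spans]
```

Since the local model does not nest the benchmark, `local_r2` can return negative values, and so can the integrated measure.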

3 Empirical Results
This section introduces our data on stock returns and predictor variables, presents empirical evi-
dence from applying the non-parametric approach to identifying local return predictability pockets
and, finally, tests whether this evidence is consistent with the conventional constant-coefficient
return prediction model in (1).

3.1 Data
Empirical studies on predictability of stock returns generally use monthly, quarterly, or annual
returns data. Data observed at these frequencies can miss episodes with return predictability at
times when the slope coefficients (β t ) change quickly, making it harder to accurately capture and
time such episodes. Being concerned here with local return predictability, which may be relatively
short-lived, we therefore initially use daily data on both stock returns and the predictor variables.
Following conventional practice in studies such as Welch and Goyal (2008), Dangl and Halling
(2012), Johannes, Korteweg and Polson (2014), and Pettenuzzo, Timmermann and Valkanov (2014),
our main empirical analysis considers univariate prediction models that include one time-varying
predictor at a time, i.e., rt+1 = xt β t + εt+1 . The univariate approach is well suited to our non-
parametric analysis which benefits from keeping the dimensionality of the set of predictors low.
However, it raises issues related to omitted state variables, so we subsequently also discuss multi-
variate extensions.
15
Note that this measure can be negative in certain periods because our time-varying coefficient model does not
nest the prevailing mean model which is the reference model in the denominator.

In all our return regressions, the dependent variable is the value-weighted CRSP US stock
market return minus the one-day return on a short T-bill rate. Turning to the predictors, we
consider four variables that have been used in numerous studies on return predictability and are
included in the list of predictors considered by Welch and Goyal (2008). First, we use the lagged
dividend-price (dp) ratio, defined as dividends over the most recent 12-month period divided by
the stock price at close of a given day, t. This predictor has been used in studies such as Keim
and Stambaugh (1986), Campbell (1987), Campbell and Shiller (1988), Fama and French (1988),
Fama and French (1989) and many others to predict stock returns. Second, we consider the yield
on a 3-month Treasury bill. Campbell (1987) and Ang and Bekaert (2007) use this as a predictor
of stock returns. As our third predictor, we use the term spread, defined as the difference in yields
on a 10-year Treasury bond and a three month Treasury bill.16 Finally, we also consider a realized
variance measure, defined as the realized variance over the previous 60 days. Again, this variable
has been used as a predictor in a number of studies of stock returns.
The final sample date is 12/31/2016 for all series. However, the beginning of the data samples
varies across the four predictor variables. Specifically, it begins on 11/4/1926 for the dp ratio
(23,786 observations), 1/4/1954 for the 3-month T-bill rate (15,860 obs.), 1/2/1962 (13,846 obs.)
for the term spread, and 1/15/1927 (23,727 obs.) for the realized variance.
The predictor variables are highly persistent at the daily frequency, posing challenges for
estimation and inference with daily data. We experimented with detrending the predictors by sub-
tracting a 6-month moving average which is a common procedure, see, e.g., Ang and Bekaert (2007).
However, we found that the results do not change very much from this type of detrending and so go
with the simpler approach of using raw data. In practice, we address the issue of how persistence
affects inference through bootstrap simulations that incorporate the high persistence of our daily
predictors along with other features of the daily data such as pronounced heteroskedasticity.
On economic grounds, we would expect return predictability to be very weak at the daily
horizon. Table 1 confirms that this holds. The table shows full-sample coefficient estimates obtained
from the linear regression model in (1) along with t-statistics and R2 values. Only the regressions
that use the T-bill rate (t-statistic of -2.78) and the term spread (t-statistic of 2.31) generate
statistically significant slope coefficients. As expected, the average predictability is extremely low
at the daily frequency with in-sample R̄2 values varying from 0.0004% for the realized variance
measure to 0.053% (i.e., 0.00053) for the regression that uses the T-bill rate as a predictor.
The lower and middle panels in Table 1 report statistics from full-sample return regressions split
separately into periods identified, in real time as we explain below, as pockets versus non-pockets.
Very large differences emerge across these two samples. In-pocket slope coefficients are notably
higher for three of the four predictor variables compared to outside the pockets, the only exception
being the realized variance. Despite being based on a much shorter sample, the in-pocket regression
coefficients are now highly statistically significant for the dp ratio (t-statistic of 2.55), T-bill rate
16
See Keim and Stambaugh (1986) and Welch and Goyal (2008) for studies using this predictor.

(-3.29), and the term spread (3.95). $R^2$ values are essentially zero outside pockets but far higher
inside the pockets for the dp ratio (0.18%), T-bill rate (0.37%), and term spread (1.47%).17
In-pocket $R^2$ values are, thus, orders of magnitude higher than the “average” return predictability
found in the full sample (Panel A). While most of the time return predictability is extremely
low at the daily frequency, some periods seemingly exhibit substantially higher predictability. We
next provide more details of how we identify those periods and where they are located.

3.2 Pockets of Local Return Predictability


Table 2 reports summary statistics for the number of pockets identified by our nonparametric
procedure along with minimum, maximum and mean values for the duration and IR2 measure.
The return regression based on the dp predictor identifies 18 pockets whose durations range from
very short (16 trading days) to much longer (610 days), averaging 193 days, or 9 months. Overall,
pockets are identified for 15% of all days in the sample. Fewer pockets (12) are identified for the
model that uses the T-bill rate predictor. However, the duration of these pockets is notably longer,
ranging from 57 to 672 days and averaging 292 days, or 14 months. These long durations mean
that pockets are identified for 24% of the days in the sample.
Seven pockets with a mean duration of 258 days (12 months) get identified for the term spread
predictor, while for the realized variance predictor, we find 16 pockets whose durations range from
25 to 1,302 days (five years), averaging 302 days (14 months).
Next, consider the amount of return predictability computed for the individual pockets. The
bottom rows in Table 2 show that the IR2 measure for the dp predictor has a mean of 1.51 and
ranges from -0.24 to 4.76. As a reference, note that a one-year (253 trading day) period with an
average daily Rt2 value of 0.004 (or 0.4%) produces an IR2 value of one. IR2 values average 3.70
for the T-bill rate predictor with a maximum value of 11.69, both far higher than the values found
for the dp predictor. The mean IR2 value is 2.92 for the term spread predictor, while the maximum
equals 7.54, again higher than for the dp ratio but lower than for the T-bill rate. The very long
pockets found for the realized variance predictor generate fairly high IR2 measures averaging 2.77
and peaking at a value of 16.42.18
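As a quick check of the reference point used above, suppose for illustration that the daily local $R^2$ is constant at 0.004 over a 253-day pocket (a simplifying assumption). Then

$$IR^2 = \sum_{\tau=1}^{253} R_\tau^2 = 253 \times 0.004 = 1.012 \approx 1.$$

By the same arithmetic, a hypothetical pocket with a duration of 292 days and an $IR^2$ of 3.70 would have an average daily local $R^2$ of about $3.70 / 292 \approx 0.0127$, or roughly 1.3%.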
A comparison of returns inside and outside the pockets (available in Appendix Table A.1) shows
that mean returns are marginally higher inside periods identified as pockets. With the exception of the
pockets identified by the dp-ratio, the first-order autocorrelation of returns is also higher inside the
pockets, ranging from 0.12 for the realized variance to 0.22 for the T-bill rate. Conversely, returns
are less volatile inside pockets and have a larger negative skew for two of the four predictors (dp
and realized variance) but only half the kurtosis compared to returns outside the pockets.
We conclude from these results that return predictability varies significantly over time and that
17
These pockets are identified using ex-ante available information, but $R^2$ values are estimated on the full sample.
18
Across our four predictors, pairwise correlations between the local R2 values range from -0.05 to 0.57.

our nonparametric regression approach is able to detect local pockets of return predictability in
real time. We next conduct a set of more formal tests of these findings.

3.3 Tests for Spurious Pockets


Because we use a new approach for identifying local return predictability, it is worth exploring its
statistical properties. For example, we are interested in knowing to what extent our approach spu-
riously identifies pockets of return predictability. Since we repeatedly compute local (overlapping)
test statistics, we are bound to find evidence of some pockets even in the absence of genuine return
predictability. The question is whether we find more pockets than we would expect by random
chance, given a reasonable model for the daily return dynamics. Another issue is whether shorter
pockets or pockets with low IR2 values are more likely to be spurious than longer ones.

3.3.1 Simulation Approach

We consider three different ways of simulating stock returns. To address the effect of using highly
persistent predictor variables on pocket detection, all three approaches assume a constant-coefficient
null for a predictor variable that follows an AR(1).
The first specification assumes homoskedastic errors and takes the form

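$$r_{t+1} = \mu_r + \gamma x_t + \varepsilon_{r,t+1}, \qquad \varepsilon_{r,t+1} \sim (0, \sigma_r^2), \tag{10}$$

$$x_{t+1} = \mu_x + \rho x_t + \varepsilon_{x,t+1}, \qquad \varepsilon_{x,t+1} \sim (0, \sigma_x^2).$$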

We estimate $\mu_r$, $\gamma$, $\mu_x$, and $\rho$ by OLS. To allow returns to follow a non-Gaussian distribution, we
draw the zero-mean innovations $\hat{\varepsilon}_{r,t+1} = r_{t+1} - \hat{\mu}_r - \hat{\gamma} x_t$ and $\hat{\varepsilon}_{x,t+1} = x_{t+1} - \hat{\mu}_x - \hat{\rho} x_t$ by means of
an i.i.d. bootstrap. Any cross-sectional dependencies are preserved by resampling the residuals in
pairs with replacement from $\{\hat{\varepsilon}_{r,t+1}, \hat{\varepsilon}_{x,t+1}\}_{t=0}^{T-1}$. Bootstrap samples of residuals $\{\hat{\varepsilon}^b_{r,t+1}, \hat{\varepsilon}^b_{x,t+1}\}_{t=0}^{T}$
are then used to iteratively construct bootstrap samples for r and x using (10) with $x_0^b = 0$.
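The i.i.d. residual bootstrap under the constant-coefficient null can be sketched as follows. The function names and the array layout (the return in position t + 1 is regressed on the predictor in position t) are assumptions made for this example.

```python
import numpy as np

def ols_line(y, x):
    """OLS of y on a constant and x; returns (intercept, slope) and residuals."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

def iid_pairs_bootstrap(r, x, rng):
    """One bootstrap sample of returns and predictor values from the fitted system."""
    r, x = np.asarray(r, dtype=float), np.asarray(x, dtype=float)
    (mu_r, gamma), e_r = ols_line(r[1:], x[:-1])
    (mu_x, rho), e_x = ols_line(x[1:], x[:-1])
    T = len(e_r)
    draw = rng.integers(0, T, size=T)   # one index per date keeps residual pairs intact
    eb_r, eb_x = e_r[draw], e_x[draw]
    xb = np.zeros(T + 1)                # predictor path starts at zero, as in the text
    rb = np.empty(T)
    for t in range(T):
        rb[t] = mu_r + gamma * xb[t] + eb_r[t]
        xb[t + 1] = mu_x + rho * xb[t] + eb_x[t]
    return rb, xb[1:]
```

Calling the function repeatedly with the same `numpy.random.Generator` produces independent bootstrap samples on which the pocket detection procedure can be rerun.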
We account for the pronounced time-varying volatility in daily returns through two additional
variants: a stationary block bootstrap and an EGARCH(1,1) model with t-distributed shocks.
The stationary block bootstrap selects the optimal block length using the method proposed by
Politis and White (2004) applied to the residuals from the return regression, $\{\hat{\varepsilon}_{r,t+1}\}_{t=0}^{T-1}$ in (10). As
in the i.i.d. case, blocks of residuals are resampled in pairs with replacement from $\{\hat{\varepsilon}_{r,t+1}, \hat{\varepsilon}_{x,t+1}\}_{t=0}^{T-1}$
to preserve cross-sectional correlation.
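A minimal sketch of the index resampling step in a stationary bootstrap is shown below. Block lengths are geometrically distributed and blocks wrap around the end of the sample. The `mean_block` argument is a hypothetical stand-in for the block length chosen by the Politis and White (2004) rule, which is not implemented here.

```python
import numpy as np

def stationary_bootstrap_indices(n, mean_block, rng):
    """Draw n resampling indices using blocks with average length mean_block."""
    p = 1.0 / mean_block                 # probability of starting a new block
    idx = np.empty(n, dtype=int)
    idx[0] = rng.integers(n)
    for i in range(1, n):
        if rng.random() < p:
            idx[i] = rng.integers(n)            # jump to a random new starting point
        else:
            idx[i] = (idx[i - 1] + 1) % n       # continue the block, wrapping around
    return idx
```

Applying the same index vector to both residual series, e.g. `e_r[idx]` and `e_x[idx]`, resamples the residuals in pairs.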

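Lastly, the EGARCH(1,1) model is given by:

$$\begin{aligned}
r_{t+1} &= \mu_r + \gamma x_t + \varepsilon_{r,t+1} \equiv \mu_r + \gamma x_t + \sqrt{h_{r,t}}\, u_{r,t+1}, \qquad u_{r,t+1} \sim t(\nu_r) \\
\ln h_{r,t+1} &= \omega_r + \alpha_r \left(|u_{r,t+1}| - E[|u_{r,t+1}|]\right) + \gamma_r u_{r,t} + \beta_r \ln h_{r,t} \\
x_{t+1} &= \mu_x + \rho x_t + \varepsilon_{x,t+1} \equiv \mu_x + \rho x_t + \sqrt{h_{x,t}}\, u_{x,t+1}, \qquad u_{x,t+1} \sim t(\nu_x) \\
\ln h_{x,t+1} &= \omega_x + \alpha_x \left(|u_{x,t+1}| - E[|u_{x,t+1}|]\right) + \gamma_x u_{x,t} + \beta_x \ln h_{x,t}.
\end{aligned} \tag{11}$$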

To simulate from this model, we first estimate the parameters and construct normalized residuals
$\hat{u}_{r,t+1} = (r_{t+1} - \hat{\mu}_r - \hat{\gamma} x_t)/\sqrt{\hat{h}_{r,t}}$ and $\hat{u}_{x,t+1} = (x_{t+1} - \hat{\mu}_x - \hat{\rho} x_t)/\sqrt{\hat{h}_{x,t}}$. We then sample pairs
$\{\hat{u}^b_{r,t+1}, \hat{u}^b_{x,t+1}\}$ i.i.d. with replacement from $\{\hat{u}_{r,t+1}, \hat{u}_{x,t+1}\}_{t=0}^{T-1}$. We construct bootstrap samples
for r, x, $h_r$, and $h_x$ using (11) and setting $x_0^b = 0$ and $h_r^b$ and $h_x^b$ equal to their estimated means.
For each of the three specifications, we generate 1,000 bootstrap samples $\{r^b_{t+1}, x^b_{t+1}\}_{t=0}^{T-1}$.
Our simulations follow the empirical analysis and define pockets as periods where the prevailing
mean model is expected to have a larger squared error than the local return predictions. For each
bootstrap sample, we record the distribution of IR2 values from (9) and use this to compute p-values
for overall sample statistics for the pocket distribution as well as for the individual pockets.

3.3.2 Significance of Individual Pockets

To get a sense of the location and duration of the pockets, Figure 1 plots one-sided non-parametric
kernel estimates of $\widehat{SED}_t$ against time for each of the four predictors. Shaded areas represent
periods identified as pockets of predictability. We distinguish between spurious and non-spurious
pockets by looking at each individual pocket’s IR2 value and computing the percentage of simula-
tions with at least one pocket matching this value. This produces an odds ratio with small values
indicating how difficult it is to match the total amount of predictability observed for the individual
pockets.19 We color pocket areas based on whether the pockets have less than (red) or more than
(blue) a 5% chance of being randomly generated.20
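The simulated odds for an individual pocket can be sketched as the share of bootstrap samples that contain at least one pocket whose integrated $R^2$ is at least as large as the observed value. Reading "matching" as "at least as large" is an interpretation made for this example, and the function name is hypothetical.

```python
import numpy as np

def pocket_pvalue(observed_ir2, simulated_ir2):
    """Fraction of simulations with a pocket whose IR^2 reaches the observed value.

    simulated_ir2 holds one sequence of pocket IR^2 values per bootstrap sample;
    a sample without any pockets is represented by an empty sequence.
    """
    hits = [len(sim) > 0 and np.max(sim) >= observed_ir2 for sim in simulated_ir2]
    return float(np.mean(hits))
```

A pocket would then be shaded as unlikely to be spurious when this fraction falls below 0.05.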
First consider the predictability plot for the dp predictor (top panel). The longest pockets occur
during the Korean War, prior to the 1990 recession and in the aftermath of the Great Recession.
Conversely, there are relatively long spells without any (long-lasting) pockets prior to 1950 and,
again, between the mid-seventies and late eighties.21 Eight of the 18 pockets identified using the
dp ratio as our predictor are statistically significant at the 5% level. Conversely, all of the shorter
pockets can be attributed to sampling error.
For the T-bill rate predictor (second panel), we locate three long-lived pockets, each lasting at
19
Returns simulated under the special case of no return predictability yield very similar results to those reported
here, as can be seen in Appendix Table A.2.
20
Appendix Table A.3 displays the simulated p-values for the individual pockets identified by our procedure.
21
Pockets do not necessarily coincide with high values of the estimated slope coefficient, β̂ t . For example, a sudden
spike in β̂ t preceded by small values of β̂ t will not produce a high value of SEDt and so will not trigger a pocket.

least two years, around 1970, in the aftermath of the early-seventies oil price shocks, and around
the Fed’s Monetarist Experiment (1979-81). Nine of the twelve pockets identified by the T-bill rate
model are significant at the 5% level, leaving only three insignificant pockets.
Most of the pockets identified by the term spread predictor (third panel) occur during the mid-
seventies and early eighties although we also locate two pockets in the mid-nineties. Five of the
seven pockets are significant at the 5% level.
For the realized variance predictor (fourth panel), pocket incidence is fairly evenly spread out
across the sample with the longest-lived pocket occurring during the Korean War, just as we found
for the dp ratio. Long pockets also occur in the late sixties and in the aftermath of the Monetarist
Experiment. This model identifies 16 pockets, 12 of which are significant at the 5% level.
Pairwise time-series correlations between the four pocket indicators depicted in Figure 1 range
from -0.02 to 0.59, indicating some overlap but also a fair amount of independent variation across
pockets identified by different predictor variables.
We conclude from these simulations that the majority of return predictability pockets identified
by our nonparametric return regressions cannot be explained by any of the return generating models
considered here. This is particularly true for the T-bill rate, term spread, and realized variance
predictors. The simulations do not come close to matching the amount of predictability observed in
the longer-lived pockets. Conversely, the shortest pockets can be due to “chance” and are matched
in many of our simulations. This point is particularly relevant for the dp regressions which are
more prone to pick up spurious, short-lived pockets. Reassuringly, since the model in equation
(11) allows for highly persistent predictors and time-varying heteroskedasticity, these features of
our data do not seem to give rise to the return predictability pockets that we observe.

4 Statistical and Economic Performance of Return Forecasts


A large part of the literature on return predictability considers linear, constant-coefficient models
based on a single predictor variable. Welch and Goyal (2008) find that such models fail to produce
more accurate out-of-sample return forecasts than those from the prevailing mean model.
To address such shortcomings, one approach is to impose economically motivated constraints
on the forecasts. Following Campbell and Thompson (2008), our analysis therefore considers three
alternative ways of constructing out-of-sample excess return forecasts that, to varying degrees,
incorporate economic restrictions: (i) unrestricted forecasts, $\hat{r}_{t+1|t}$; (ii) non-negative forecasts that
replace negative forecasts with zero, $\max(0, \hat{r}_{t+1|t})$; and (iii) return forecasts that, in addition to
imposing the constraint in (ii), set $\hat{\beta}_t = 0$ if the estimated slope coefficient is inconsistent with
our prior expectation of its sign (positive for the dp ratio, term spread, and realized variance, and
negative for the T-bill rate).
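The three forecast variants can be illustrated with a small sketch. It assumes a univariate model with a separate intercept and slope, and it keeps the intercept unchanged when the slope is set to zero; both are assumptions of this example, as is the function name.

```python
import numpy as np

def forecast_variants(intercept, slope, x, expected_sign):
    """Return the unrestricted, non-negative, and sign-restricted forecasts."""
    unrestricted = intercept + slope * x
    nonnegative = np.maximum(0.0, unrestricted)
    # Zero out the slope when its sign disagrees with the prior, then truncate at zero.
    slope_ok = np.sign(slope) == expected_sign
    restricted_slope = np.where(slope_ok, slope, 0.0)
    sign_restricted = np.maximum(0.0, intercept + restricted_slope * x)
    return unrestricted, nonnegative, sign_restricted
```

For example, with `expected_sign=+1` a negative estimated slope on the dp ratio is replaced by zero, so the third forecast collapses to the truncated intercept.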
A second approach is to incorporate multivariate information in the return prediction models.
We describe alternative ways to do so further below.

4.1 Performance Measures
We first explain how we evaluate the performance of our local return forecasts using both statistical
and economic performance measures. Following Welch and Goyal (2008), we compare our one-sided
return forecasts against forecasts from a prevailing mean model, $\bar{r}_{t+1|t} = \frac{1}{t}\sum_{s=1}^{t} r_s$. To test the
null of equal predictive accuracy, we use a Clark and West (2007) (CW) test with positive values
indicating that the local, one-sided forecasting approach improves on the prevailing mean.
The Clark and West test has three main advantages over conventional test procedures such
as Diebold and Mariano (1995) and Clark and McCracken (2001). First, unlike the Diebold and
Mariano (1995) test, it can be used to compare the accuracy of out-of-sample forecasts from nested
prediction models as is frequently encountered in finance. Second, unlike the Clark and McCracken
(2001) test, the Clark and West statistic can be compared to critical values from the standard
normal distribution and does not rely on simulated critical values. Third, the Clark and West
test accounts for the greater finite-sample effect that parameter estimation error can be expected
to have on the bigger model (relative to the prevailing mean) and so better summarizes the true
predictive power of the underlying state variable(s) in the bigger model.
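For concreteness, the prevailing mean forecasts and the Clark and West (2007) statistic can be sketched as below. The sketch uses a plain standard error for the mean of the adjusted loss differential; any serial-correlation adjustment is omitted here, which is a simplification of this example.

```python
import numpy as np

def prevailing_mean_forecasts(r):
    """Entry t is the forecast of r[t]: the average of r[0], ..., r[t-1] (NaN at t = 0)."""
    r = np.asarray(r, dtype=float)
    fc = np.full(len(r), np.nan)
    fc[1:] = np.cumsum(r)[:-1] / np.arange(1, len(r))
    return fc

def clark_west_stat(r, bench_fc, model_fc):
    """t-statistic on the mean of the adjusted squared error differential."""
    r, b, m = (np.asarray(a, dtype=float) for a in (r, bench_fc, model_fc))
    # Benchmark loss minus model loss, with the model loss adjusted for the
    # noise added by estimating the larger model.
    f = (r - b) ** 2 - ((r - m) ** 2 - (b - m) ** 2)
    return float(np.mean(f) / (np.std(f, ddof=1) / np.sqrt(len(f))))
```

The statistic is compared with one-sided standard normal critical values, so a value above 1.645 would reject equal accuracy at the 5% level in favor of the larger model.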
To assess the economic significance of our forecasting results, we adopt a strategy similar to
that of Gomez Cram (2021) to construct a mean-variance optimized pocket portfolio invested in
stocks and T-bills. Each forecasting model is used to compute real-time forecasts of expected excess
returns, $E_t[r_{t+1}]$, and form a managed portfolio with excess returns

$$r^p_{t+1} = c \cdot E_t[r_{t+1}] \cdot r_{t+1}, \tag{12}$$

where $r_{t+1}$ is the realized market excess return and the constant c is defined as

$$c \equiv \left( \frac{Var(r_{t+1})}{Var(E_t[r_{t+1}] \cdot r_{t+1})} \right)^{1/2}.$$

The weight placed on the market is given by c · Et [rt+1 ], which we restrict to be between 0 and 2,
ruling out short sales and capping the leverage ratio at two.
Next, we use the excess returns on the managed pocket portfolio (12) to estimate the risk-
adjusted return (α) from the regression:

$$r^p_{t+1} = \alpha + \beta r_{t+1} + \varepsilon_{t+1}, \qquad \varepsilon_{t+1} \sim (0, \sigma_\varepsilon).$$

In addition, as is common practice, we compute the Sharpe ratio for the managed portfolio.
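The portfolio construction and its performance measures can be sketched as follows. The scaling constant is computed from full-sample variances, the weight limits are applied after scaling, and the Sharpe ratio is annualized with 253 trading days; these are assumptions of this illustration, and the function names are hypothetical.

```python
import numpy as np

def managed_portfolio(expected, realized, w_min=0.0, w_max=2.0):
    """Market weights c * E_t[r_{t+1}], clipped to [w_min, w_max], and portfolio returns."""
    expected = np.asarray(expected, dtype=float)
    realized = np.asarray(realized, dtype=float)
    c = np.sqrt(np.var(realized) / np.var(expected * realized))
    weights = np.clip(c * expected, w_min, w_max)   # no short sales, leverage capped
    return weights, weights * realized

def alpha_beta(portfolio, market):
    """Intercept and slope from regressing portfolio excess returns on the market."""
    market = np.asarray(market, dtype=float)
    X = np.column_stack([np.ones(len(market)), market])
    (alpha, beta), *_ = np.linalg.lstsq(X, np.asarray(portfolio, dtype=float), rcond=None)
    return float(alpha), float(beta)

def sharpe_ratio(excess, periods_per_year=253):
    """Annualized ratio of mean to standard deviation of excess returns."""
    excess = np.asarray(excess, dtype=float)
    return float(np.sqrt(periods_per_year) * np.mean(excess) / np.std(excess, ddof=1))
```

The alpha returned here is per period (daily for daily data) and would need to be scaled to be quoted per annum.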

4.2 Univariate Return Forecasts


Panel A in Table 3 reports the outcome of the CW tests. Across all days in the out-of-sample period
(column 1), the prevailing mean forecasts and the unrestricted local return forecasts are broadly

equally accurate and the null of equal predictive accuracy does not get rejected. Thus, local return
predictability could not have been exploited in real time to produce daily return forecasts that “on
average” were more accurate than forecasts from a model that assumes a constant equity premium.
Inside the local pockets (column 2), the CW test statistics are instead positive and highly statis-
tically significant for all four predictors. Outside the pockets (column 3), all four predictor models
produce very poor forecasting performance with negative CW test statistics that are significant at
the 10% level or above. Imposing the economic constraint that forecasts of excess returns cannot be
negative (columns 4-6) leads to improvements in all four one-sided kernel forecasts which, for the
T-bill rate, are now significantly more accurate at the 5% level even in the full sample, in addition
to being significant at the 1% level for all four predictors inside the pockets. The constraint does
not notably improve predictive accuracy out-of-pocket, however. Imposing additional sign restric-
tions on the slope coefficients (columns 7-9) leads to similar performance as that of the model that
only restricts the sign of the return forecasts.22
We next examine the economic performance results reported in Panel B in Table 3. For the
unrestricted univariate prediction models, the risk-adjusted return (α) is economically large and
highly statistically significant for the dp ratio (1.69% per annum), T-bill rate (3.57%), term spread
(3.14%), and realized variance (2.31%) predictors. The associated Sharpe ratios range from 0.47
for the dp ratio to 0.79 for the T-bill rate.23 For comparison, the prevailing mean forecasts generate
a negative α (-0.25) and a Sharpe ratio of 0.46.
Imposing the restriction that forecasts of mean excess returns should be non-negative leads to
improvements in all three performance measures. Alphas now range from 2.51% (dp ratio) to 6.48%
(T-bill rate), while Sharpe ratios increase more marginally. Imposing sign restrictions on the slope
estimates yields broadly similar risk-adjusted return performance as imposing the sign restriction
on the predicted excess return.

4.3 Incorporating Multivariate Information


We next consider ways in which multivariate information can be incorporated into the forecasts.
Based on economic reasoning or more formal model selection methods (Pesaran and Timmermann,
1995), a first approach is to identify a small set of included predictors.24 In our analysis we consider
a multivariate local kernel regression model (3) which simultaneously uses all four predictors (all
of which can be economically motivated) to construct forecasts. The model is estimated on the
subsample for which all four predictor variables are available, and we use a product kernel where
22
Cumulative sum of squared error plots similar to those in Welch and Goyal (2008), described in Appendix C
and displayed in Appendix Figure A.1, show that the local kernel regressions outperform the prevailing mean model
fairly steadily inside pockets while the opposite holds true outside pockets.
23
The smaller α estimates for the forecasting model that uses the dp ratio are largely a result of this model bumping
up against the (upper) constraints on the portfolio weights inside pockets.
24
Including a large number of predictors (“kitchen sink”) generally leads to very poor out-of-sample forecasting
performance due to estimation error.

each variable is assigned the same bandwidth.
Second, dimensionality reduction methods such as principal components (PC) can be applied
directly on the set of predictors to form linear combinations that explain as much of the common
variation in the predictors as possible (see e.g., Pettenuzzo, Timmermann and Valkanov, 2014). We
apply PC in real time to extract the first principal component (pc) from the four predictors.
Third, forecast combination methods can be used to form averages of the forecasts produced
by small (univariate) models; see Rapach, Strauss and Zhou (2010). We consider three different
combination schemes. The first (comb1) sets an individual predictor’s forecast to the local kernel
forecast ($\hat{r}^i_{t+1|t}$) inside pockets, otherwise reverting to the prevailing mean ($\bar{r}_{t+1|t}$) if no pocket has
been identified by the predictor, before computing an equal-weighted average:

$$\hat{y}^{comb1}_{t+1|t} = \frac{1}{4} \sum_{i=1}^{4} \left[ 1\{\widehat{SED}_{it} \ge 0\}\, \hat{r}^i_{t+1|t} + 1\{\widehat{SED}_{it} < 0\}\, \bar{r}_{t+1|t} \right], \tag{13}$$

where the indicator $1\{\widehat{SED}_{it} \ge 0\}$ equals one if the expected value of the local squared forecast error
differential exceeds zero for predictor i, and equals zero otherwise. For example, if the first univariate
prediction model identifies a pocket while the remaining models do not, comb1 weights the forecast
from the first model by 25% and the prevailing mean by 75%.
The second combination (comb2) ignores forecasts from models that do not currently identify
a pocket provided that at least one variable has identified a pocket:
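$$\hat{y}^{comb2}_{t+1|t} = \begin{cases} \dfrac{1}{n_t} \sum_{i=1}^{4} 1\{\widehat{SED}_{it} \ge 0\}\, \hat{r}^i_{t+1|t} & \text{if } n_t > 0 \\ \bar{r}_{t+1|t} & \text{if } n_t = 0 \end{cases}, \tag{14}$$

where $n_t = \sum_{i=1}^{4} 1\{\widehat{SED}_{it} \ge 0\}$ is the number of predictors that identify a pocket at time t.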
The third combination (comb3) makes no distinction between pocket and non-pocket periods,
always using the simple equal-weighted average of all four univariate models:

$$\hat{y}^{comb3}_{t+1|t} = \frac{1}{4} \sum_{i=1}^{4} \hat{r}^i_{t+1|t}. \tag{15}$$
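The three combination schemes can be sketched for a single date as follows. The function and argument names are hypothetical, and the second scheme averages over pocket predictors only when at least one pocket is identified, following the verbal description in the text.

```python
import numpy as np

def combine_forecasts(local_fc, sed_hat, mean_fc):
    """Return (comb1, comb2, comb3) for one date.

    local_fc: local kernel forecasts, one per predictor
    sed_hat:  fitted squared error differentials, one per predictor
    mean_fc:  prevailing mean forecast (a scalar)
    """
    local_fc = np.asarray(local_fc, dtype=float)
    in_pocket = np.asarray(sed_hat, dtype=float) >= 0
    # comb1: replace out-of-pocket forecasts by the prevailing mean, then average.
    comb1 = float(np.mean(np.where(in_pocket, local_fc, mean_fc)))
    # comb2: average only over predictors currently in a pocket, if there are any.
    n_t = int(in_pocket.sum())
    comb2 = float(local_fc[in_pocket].mean()) if n_t > 0 else float(mean_fc)
    # comb3: plain equal-weighted average of all local forecasts.
    comb3 = float(local_fc.mean())
    return comb1, comb2, comb3
```

With one predictor in a pocket out of four, `comb1` places a weight of 25% on that predictor's forecast and 75% on the prevailing mean, matching the example in the text.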

4.3.1 Empirical results

Rows 5-9 in Table 3 report the empirical performance of the multivariate prediction schemes. The
fifth row in Panel A shows that the multivariate kernel approach delivers good out-of-sample fore-
casting performance inside pockets with CW test statistics of 3.74 and 4.01 for the unrestricted and
two sign-restricted forecasts, respectively. Predictive accuracy on out-of-pocket days is comparable
to that of the univariate forecasting models.25
25
The six pockets identified by this approach that includes all four predictors (shown in the bottom panel in Figure
1) overlap to some extent with the pockets identified by the univariate kernel regressions.

Table 3 further shows that the PC approach delivers very good out-of-sample forecasting per-
formance inside the pockets with CW test statistics of 2.71 and 4.69 for the unrestricted and two
sign-restricted forecasts, respectively. Moreover, while the PC forecasts underperform outside the
pockets, they do so to a smaller extent than the univariate forecasts and so are more accurate in
the full sample for 10 of the 12 pairwise comparisons against the univariate models.
Among the combination methods, comb1 and comb2 generate positive and highly significant
CW test statistics both for the full sample and in-pocket periods regardless of whether we combine
forecasts from the unrestricted or restricted univariate models. In contrast, the simple equal-
weighted average (comb3) performs worse than the underlying univariate forecasts. Since this
approach does not distinguish between in-pocket and out-of-pocket periods, this suggests that such
conditioning is important to the benefits from forecast combination.
Examining the economic performance measures, we find that the PC approach performs very
well with alpha estimates and Sharpe ratios close to those of the best-performing univariate models.
The combination methods that condition the underlying forecasts on whether a pocket has been
identified (comb1 and comb2) produce the best overall economic performance while the equal-
weighted combination (comb3) produces poor economic performance.

4.4 Simulation Evidence


The empirical evidence summarized above demonstrates that the local kernel regressions can gen-
erate forecasts that are significantly more accurate than the benchmark inside ex-ante identified
pockets, though not outside these pockets or in the full sample. As an additional robustness check,
we use our Monte Carlo simulation setup from Section 3 to explore whether similar improvements
in predictive accuracy can be achieved by the statistical models introduced earlier.
Table 4 summarizes results from simulating the three models and generating forecasts along
the unrestricted and restricted schemes described earlier. The simulations are conducted under
the null of constant return predictability ($\gamma \neq 0$), but all results are robust to assuming no return
predictability (γ = 0) as shown in Appendix Table A.2.
The results are very clear and easily summarized: For all models, the simulations match both
the full-sample and out-of-pocket CW test statistics. Conversely, we find no single instance in which
the simulations match the in-pocket CW statistic for any predictor or for any of the forecasting
schemes. For the economic performance measures, the statistical models match the Sharpe ratio
in some cases but fail to match the alphas or alpha t-statistics.26

26
Alpha t-statistics are added because they have better sampling properties than alpha estimates.

Another possible concern is the Stambaugh (1999) bias, which affects the estimated slope
coefficient of return prediction models in cases where the predictor
variable follows a highly persistent process and the correlation between innovations to the predictor
variable and shocks to the return equation is large. Through a set of simulations described in
Appendix D and displayed in Table A.4, we show that this bias does not lead us to spuriously
identify pockets, largely because of our use of an out-of-sample pocket identification approach.

4.5 Local Prevailing Mean Benchmark


So far, we have followed studies such as Welch and Goyal (2008) and benchmarked our return forecasts
against a “global prevailing mean” that uses an expanding estimation window. However, a “local
prevailing mean” model is an interesting alternative benchmark since this enables us to determine
if our kernel regression forecasts are simply picking up local return momentum. To explore this
point, let $\tilde{r}^{lpm}_{t|t-1} = \sum_{\tau=1}^{t-1} K(\tau)\, r(\tau)$ be the prediction from the local prevailing mean (lpm) model
and replace equation (5) with

$$SED^{lpm}_t = (r_t - \tilde{r}^{lpm}_{t|t-1})^2 - (r_t - \hat{r}_{t|t-1})^2. \qquad (16)$$

We can then apply our kernel regression in (6) to estimate a local trend in $SED^{lpm}_t$ and identify
pockets.
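The construction of the local prevailing mean and of the squared error differential in (16) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Epanechnikov-type weight function, the normalization of the weights to sum to one, and the function names are all assumptions made for the example, and `bandwidth` is measured in observations and taken to exceed one.

```python
import numpy as np

def one_sided_weights(n_past, bandwidth):
    """Weights on the last `n_past` observations (assumed Epanechnikov-type kernel).

    Lag 1 is the most recent observation; weights are normalized to sum to one.
    """
    lags = np.arange(1, n_past + 1)
    u = lags / bandwidth
    w = np.where(u < 1.0, 1.0 - u**2, 0.0)
    return w / w.sum()

def local_prevailing_mean(returns, bandwidth):
    """Forecast r_t by a kernel-weighted average of r_1, ..., r_{t-1}."""
    returns = np.asarray(returns, dtype=float)
    forecasts = np.full(returns.shape, np.nan)  # no forecast for the first date
    for t in range(1, len(returns)):
        w = one_sided_weights(t, bandwidth)
        past = returns[:t][::-1]  # most recent observation first
        forecasts[t] = np.dot(w, past)
    return forecasts

def sed_lpm(returns, lpm_forecast, model_forecast):
    """Equation (16): squared error of the lpm benchmark minus that of the model."""
    returns = np.asarray(returns, dtype=float)
    return (returns - lpm_forecast) ** 2 - (returns - model_forecast) ** 2
```

Positive values of `sed_lpm` mark dates on which the predictor-based forecast had the smaller squared error; smoothing this series with the same one-sided weights would then give the local trend used to flag pockets.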
The results, reported in Appendix Tables A.5 and A.6, show that our kernel regression forecasts
based on time-varying predictors perform well relative to forecasts from the local prevailing mean
model, producing strong economic performance and highly significant CW test statistics in-pocket
and small but mostly statistically insignificant test statistics out-of-pocket.
In a second exercise, we revert to using return forecasts from the global prevailing mean model
to detect pockets, but instead measure predictive accuracy against the local prevailing mean model
so as to explore whether, inside the pockets identified by our predictors, their return forecasts are
more accurate than forecasts from the local prevailing mean. This would not hold if our pockets
were merely picking up local return momentum.
In results reported in Appendix Tables A.7 and A.8, we continue to find that the time-varying
predictors produce strong economic performance and highly significant CW test statistics inside
the pockets, though not outside pockets.
As a final exercise, we again use forecasts from the global prevailing mean model both to identify
local pockets and as the benchmark for our return forecasts. However, now we also consider the
pockets identified by the local prevailing mean model by comparing the accuracy of its return
forecasts to the return forecasts from the global prevailing mean. Next, to examine if our time-
varying predictors contain additional information that is not present in past returns, we consider
the performance of our time-varying predictor models in those periods they identify as pockets that
are not also identified as pockets by the local prevailing mean model. Pockets identified in this
manner can thus be attributed to the additional information in the time-varying predictors that
is not contained in the local prevailing mean forecast. In this analysis, only pockets that do not
overlap with those identified by the local prevailing mean model are singled out. All other periods
are classified as out-of-pocket.
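The classification rule in this final exercise can be sketched in a few lines. The sketch assumes that a predictor-model pocket is dropped in its entirety whenever any of its dates is also flagged by the local prevailing mean model; the text does not say whether overlap is judged pocket by pocket or date by date, so this reading, like the function names, is an assumption.

```python
import numpy as np

def contiguous_runs(indicator):
    """Return (start, end) index pairs, end exclusive, for each run of True values."""
    flags = np.asarray(indicator, dtype=bool)
    padded = np.concatenate(([False], flags, [False])).astype(int)
    changes = np.diff(padded)
    starts = np.flatnonzero(changes == 1)
    ends = np.flatnonzero(changes == -1)
    return list(zip(starts, ends))

def pockets_not_in_benchmark(model_pocket, lpm_pocket):
    """Keep only model pockets that share no date with any lpm pocket."""
    lpm = np.asarray(lpm_pocket, dtype=bool)
    keep = np.zeros(len(lpm), dtype=bool)
    for start, end in contiguous_runs(model_pocket):
        if not lpm[start:end].any():
            keep[start:end] = True
    return keep  # all other dates are treated as out-of-pocket
```

For example, with model pockets on dates 1-2, 4-6, and 8 and an lpm pocket on date 5, only dates 1-2 and 8 remain in-pocket.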

Despite the reduction in the number of in-pocket observations associated with this scheme, for
most of the predictors and the first two forecast combination schemes we continue to find significant
improvements in predictive accuracy inside the pockets not identified as such by the local prevailing
mean. Moreover, these gains in predictive accuracy strengthen notably from imposing economic
constraints. We also find significant economic gains for all predictors with the exception of the dp ratio
forecasts whose alpha estimates remain positive, though not significant. Details of these results are
reported in Appendix Table A.9.

4.6 Choice of Bandwidth


Our pocket identification scheme relies on two windows, namely the estimation window used by
the local kernel regression to generate return forecasts and the performance monitoring window
used to capture if these forecasts are expected to produce a lower squared forecast error than the
benchmark model. Our baseline results set these windows to 2.5 years–half of a five-year two-sided
window–and one year, respectively. We set the estimation window slightly longer due to the well-
known adverse effect of parameter estimation error in an inherently noisy environment, while the
shorter monitoring window reflects our prior that local return predictability cannot last too long.
To explore the robustness of our results with regard to these choices, we let the estimation
window vary between 2 and 3 years–corresponding to two-sided windows of 4 and 6 years–while
the window used to track SED values varies between 6 and 18 months.27
Table 5 reports results from the robustness analysis with in-pocket and out-of-pocket results
listed in the right and left columns, respectively. In both cases, the first column lists the results
from the baseline scenario. The CW test statistics are highly robust to changes in the window
lengths–slightly better for the short monitoring window and slightly worse for the longer one–as
we continue to find strong evidence that both the univariate and multivariate approaches produce
significantly more accurate in-pocket return forecasts than the prevailing mean model, but less
accurate forecasts out-of-pocket.
A similar set of robustness tests applied to the economic performance measures yields the same
conclusion, namely that a broad range of choices of the two window sizes leads to highly significant
alpha estimates for the managed portfolios that use our pocket methodology.
27
The vast majority of our results continue to hold for additional parameter configurations, including ones in
which the two bandwidth parameters are identical, e.g., both 1.5 years or 2 years. However, the performance of
the forecasting models based on the dp ratio and rvar starts deteriorating when the bandwidth parameters used for
pocket detection and parameter estimation are both short. This is what we would expect because these variables
have noisier time series which means that the combined effect of estimation error in the two regression steps starts to
dominate for these predictors. We do not observe this effect for the other variables or for the combination approaches.
We also find that our results are robust to longer windows such as a kernel estimation window of five years.

4.7 Controlling for Volatility, Momentum, and Transaction Costs
We next examine the effect of controlling for portfolios that manage volatility and momentum.
Following Moreira and Muir (2017), we define a volatility factor as

$$f^{\sigma}_{t+1} \equiv \frac{c}{\hat{\sigma}^2_t(r)}\, r_{t+1},$$

where $r_{t+1}$ is the buy-and-hold excess return on the market, $\hat{\sigma}^2_t(r)$ is a proxy for the portfolio’s
conditional variance, and $c$ controls the average exposure of the strategy. As in Moreira and Muir
(2017), we use the one-month realized variance estimate of excess returns as $\hat{\sigma}^2_t(r)$.
We also define a momentum factor as in Moskowitz, Ooi and Pedersen (2012):

$$f^{mom}_{t+1} \equiv \mathrm{sign}(r_{t-252,t})\, c\, \frac{r_{t+1}}{\hat{\sigma}_t(r)},$$

where $\mathrm{sign}(r_{t-252,t})$ is the sign of the excess return on the market over the past year (1 if positive,
0 otherwise), and $c$ again controls for the average exposure of the strategy.
Results from regressions extended to include these factors are presented in Table 6. Specifically,
we estimate α as the intercept from regressions of portfolio excess returns, $r_{p,t+1}$, on $r_{t+1}$, $f^{\sigma}_{t+1}$, and
$f^{mom}_{t+1}$. While controlling for these factors slightly reduces performance, all α estimates, except those
associated with the equal-weighted combination, remain statistically and economically significant.
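The two control factors and the alpha regression can be sketched as below. This is an illustration under stated assumptions: the inputs are taken to be pre-aligned arrays (the variance and volatility proxies dated t, the returns dated t+1), the scaling constant `c` is left as a free parameter, and alpha is a plain OLS intercept with no adjustment to standard errors. The function names are hypothetical.

```python
import numpy as np

def volatility_factor(r_next, var_lag, c=1.0):
    """f^sigma_{t+1} = c / var_t * r_{t+1}."""
    return c / np.asarray(var_lag, dtype=float) * np.asarray(r_next, dtype=float)

def momentum_factor(r_next, past_year_return, vol_lag, c=1.0):
    """f^mom_{t+1} = sign * c * r_{t+1} / vol_t, with sign = 1 if the past-year return is positive, else 0."""
    sign = (np.asarray(past_year_return, dtype=float) > 0).astype(float)
    return sign * c * np.asarray(r_next, dtype=float) / np.asarray(vol_lag, dtype=float)

def alpha_from_regression(r_portfolio, factors):
    """OLS intercept from regressing portfolio excess returns on the given factor series."""
    y = np.asarray(r_portfolio, dtype=float)
    X = np.column_stack([np.ones(len(y))] + [np.asarray(f, dtype=float) for f in factors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0]
```

Passing `[r_market, f_vol, f_mom]` as `factors` gives the intercept from the three-factor regression described in the text.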
As an alternative approach to controlling for volatility, we conduct an additional version of
the trading strategy in which we construct portfolio weights by dividing expected returns from
each model by our measure of realized variance, rvar, which can be viewed as a proxy for the
conditional return variance. If our time-varying mean forecasts are mainly identifying periods with
high return volatility (indicating a constant risk-return trade-off), this weighting scheme should
result in smoother allocations to the market portfolio. Conversely, if our local kernel return forecasts
identify a time-varying risk-return trade-off, we should expect to continue to find strong economic
performance for our trading strategy. Compared to our benchmark results, we find that accounting
for time-varying variance estimates strengthens our results with regard to estimated alphas and
Sharpe ratios (Appendix Table A.10).
Transaction costs are another concern for the interpretation of our economic performance estimates.
To address this issue, we examine the effect on return performance of proportional trading
costs of 1, 2, and 10 bps. Due to a modest portfolio turnover, we only observe small reductions in
alpha estimates as a result of introducing transaction costs. Our alpha estimates for the local kernel
prediction models remain strongly statistically and economically significant under all specifications
even for proportional trading costs as high as 10 bps. Results are especially strong for the trading
strategies that impose economic restrictions on the forecasts (Appendix Table A.11).28
28
Equivalently, the proportional trading costs at which the market timing strategy breaks even are quite high: 59,
129, 154, and 107 bps for the dp ratio, T-bill rate, term spread, and realized variance predictors, respectively.
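The effect of proportional trading costs on strategy returns can be illustrated with a short sketch. It assumes that costs are charged on the absolute change in the market weight from one period to the next and that the initial position is built from a zero holding; the exact cost convention is not spelled out in this section, so both choices are assumptions.

```python
import numpy as np

def net_returns(weights, market_excess_returns, cost_bps):
    """Strategy excess returns net of proportional costs on weight changes."""
    w = np.asarray(weights, dtype=float)
    r = np.asarray(market_excess_returns, dtype=float)
    # Turnover is the absolute change in the market weight; the first
    # position is assumed to be established from a zero holding.
    turnover = np.abs(np.diff(w, prepend=0.0))
    cost = cost_bps / 10_000.0 * turnover
    return w * r - cost
```

With low turnover the `cost` term is small relative to `w * r`, which is the mechanism behind the modest reductions in alpha reported above.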

4.8 Monthly Return Predictions
Our analysis so far used daily returns data in order to allow for the possibility that some of the local
pockets could be short-lived. However, the majority of studies in the return predictability literature
use monthly or lower-frequency data, so it is important to also conduct our analysis at this frequency to
make our results more directly comparable to the literature.
Columns to the right in Table 2 report pocket statistics for the monthly data. The number of
pockets along with the proportion of the monthly sample identified as pockets is very similar to
that identified for the daily returns data. Pocket durations (converted into days) tend to be a little
shorter in the monthly data and, as a result, the average $IR^2$ statistics are substantially lower for
three of the four predictor variables, the only exception being the dp ratio.
Figure 2 displays the pockets identified at the monthly frequency, using the same layout as in
Figure 1. As for the daily data, we use a one-sided kernel with a bandwidth of 2.5 years. We
find clear similarities between the pockets identified using the daily and monthly data. Indeed, the
correlation between the daily pocket indicator (converted into a monthly value) and the monthly
pocket indicator is 0.51 for the dp model and 0.65 for the T-bill rate model which is very high
considering we are using a crude scheme for converting monthly values of the pocket indicator to a
daily series. For the term spread and realized variance regressions, the corresponding correlations
are 0.51 and 0.55. Pockets identified with monthly data are, thus, very similar to those identified
using daily data which is reassuring from a robustness perspective.
Using a similar simulation setup as that described in Section 3, we find that none of the pockets
identified with monthly data are statistically significant. This is in marked contrast to the results
obtained for the daily data and shows that a notable advantage from using higher-frequency data
is the associated increase in statistical power.29
Table 7 reports evidence on the statistical accuracy and economic value of our monthly out-of-
sample return forecasts. In the full sample, the statistical accuracy of the return forecasts generated
by our local regression approach (Panel A) is indistinguishable from the prevailing mean forecasts.
Inside pockets the story is completely different, however, as the CW test statistics are positive
and highly statistically significant for all four predictors. The reason for these findings is again
the poor predictive accuracy of the univariate forecasts outside the pockets. Imposing the sign
constraint on excess return forecasts does not lead to notably better full-sample performance as
the CW test statistics tend to increase inside the pockets but decrease outside pockets relative to
the unrestricted forecasts.

29
The bootstrap procedure has weak power because it uses only information on the $IR^2$ estimate for each individual
pocket and does not pool data across pockets to get a longer evaluation sample.

As in the daily data, we find that the multivariate PC method performs similarly or a little
better than the univariate forecasting methods, depending on whether the unrestricted or restricted
forecasts are considered. The first two combination methods again perform very well, generating
CW test statistics that are significant both in the full sample and inside pockets with values that
exceed those obtained from the underlying univariate forecasting models. Conversely, the equal-
weighted combination (comb3) performs worse than the underlying univariate forecasts.
For the economic performance measures (Panel B), we continue to find strong performance of the
univariate monthly forecasting models with patterns that resemble those found in the daily data.
Alphas are positive, economically large and highly statistically significant and improve notably
when we impose either set of economic restrictions. Sharpe ratios start low for the unrestricted
forecasts but improve by a sizeable amount once we impose the sign restrictions.
Monte Carlo simulations based on the three statistical models in Section 3, but now applied
to the monthly returns data, lead to similar conclusions as those reported in Table 4 for the daily
returns data. Specifically, all three models fail to match the observed in-pocket return predictability
although they easily match out-of-pocket results. The statistical models also fail to get close to
matching the alpha estimates observed in the monthly data (Appendix Table A.12).
Our combination that averages forecasts from models classified as being in a pocket (comb2)
achieves an out-of-sample monthly $R^2$ value of 15.0%. Rapach, Strauss and Zhou (2010) report
quarterly recession $R^2$ values of 4-8% using a forecast combination with 15 underlying predictors.30
We conclude from these findings that our local kernel regression approach could have been used
also at a frequency (monthly) similar to that used in the literature to identify, in real time, local
pockets with a high degree of return predictability.
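For reference, an out-of-sample $R^2$ of the kind quoted above is commonly computed as one minus the ratio of the model's sum of squared forecast errors to that of the benchmark. The sketch below uses this conventional definition, which is an assumption here since the formula is not restated in this section; the optional `mask` argument, also an addition for illustration, restricts the evaluation to a subset of dates such as in-pocket periods.

```python
import numpy as np

def oos_r2(realized, model_forecast, benchmark_forecast, mask=None):
    """Out-of-sample R^2: 1 - SSE(model) / SSE(benchmark), optionally on a subsample."""
    r = np.asarray(realized, dtype=float)
    m = np.asarray(model_forecast, dtype=float)
    b = np.asarray(benchmark_forecast, dtype=float)
    if mask is not None:
        keep = np.asarray(mask, dtype=bool)
        r, m, b = r[keep], m[keep], b[keep]
    sse_model = np.sum((r - m) ** 2)
    sse_bench = np.sum((r - b) ** 2)
    return 1.0 - sse_model / sse_bench
```

A positive value means the model forecasts had a lower mean squared error than the benchmark over the evaluated dates.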

4.9 Lumpiness in Return Predictability


The lumpiness that triggers pockets in our empirical exercise comes from our binary decision rule
which classifies pockets according to whether or not $\widehat{SED}_t > 0$, thus producing a pocket indicator
akin to the binary NBER recession indicator used to track fluctuations in economic activity. To
explore if, more broadly, our return forecasts are more accurate when $\widehat{SED}_t$ is large and positive
compared to when it is small or negative, we also perform a simple exercise in which we sort
days into four quartiles ranked by the value of $\widehat{SED}_t$, from the lowest 25% to the highest 25%.
For each quartile, we then compute the CW test statistic.
We find that the accuracy of our return forecasts is monotonically increasing across the
$\widehat{SED}_t$-sorted quartiles for three of the four predictor variables, only displaying a slight non-monotonicity
for the rvar predictor.31 Similar patterns emerge with more bins, showing that our pocket
identification scheme generates a strong signal about local return predictability.
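The quartile exercise can be sketched as follows, taking the CW statistic to be the Clark and West (2007) MSPE-adjusted test. For brevity the sketch uses a plain t-statistic on the adjusted loss differential without any correction for serial correlation, which is a simplification and not necessarily the paper's choice; the function names are hypothetical.

```python
import numpy as np

def clark_west_stat(realized, benchmark_forecast, model_forecast):
    """t-statistic on the Clark-West MSPE-adjusted loss differential."""
    r = np.asarray(realized, dtype=float)
    b = np.asarray(benchmark_forecast, dtype=float)
    m = np.asarray(model_forecast, dtype=float)
    # Benchmark squared error minus model squared error, adjusted upward by
    # the squared difference between the two forecasts.
    f = (r - b) ** 2 - ((r - m) ** 2 - (b - m) ** 2)
    return np.sqrt(len(f)) * f.mean() / f.std(ddof=1)

def cw_by_sed_quartile(sed_hat, realized, benchmark_forecast, model_forecast):
    """CW statistic within each quartile of the estimated SED trend, lowest first."""
    sed_hat = np.asarray(sed_hat, dtype=float)
    r = np.asarray(realized, dtype=float)
    b = np.asarray(benchmark_forecast, dtype=float)
    m = np.asarray(model_forecast, dtype=float)
    edges = np.quantile(sed_hat, [0.25, 0.5, 0.75])
    bins = np.digitize(sed_hat, edges)  # 0 = lowest quartile, 3 = highest
    return [clark_west_stat(r[bins == q], b[bins == q], m[bins == q]) for q in range(4)]
```

A monotone pattern in the returned list, with the largest statistic in the top quartile, corresponds to the finding described above.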
30
Note that the two $R^2$ values are not directly comparable since we choose pockets based on patterns in local return
predictability while recession $R^2$ values instead are based on an (extraneous) economic indicator.
31
Results are shown in Appendix Figure A.2.

4.10 Pockets in Size and Value Factor Returns
The sticky expectations model discussed further in Section 6 provides a mechanism for generating
local predictability pockets not only in aggregate market returns but also in factor dynamics. We
therefore next explore whether local predictability pockets can be identified in the returns on the
SMB and HML Fama-French factors. These data, obtained from Ken French’s website, are available
over the same sample as the excess return and dividend-price ratio data, going back to 11/4/1926.
For the SMB series, the fraction of the sample spent inside pockets ranges between 0.24 (term
spread model) and 0.35 (dp). These values are somewhat higher than those found for the market
return series as is reflected in a longer mean duration ranging from 255 days (term spread) to 332
days (T-bill rate). The $IR^2$ measures are also higher for this spread portfolio compared to the
market with mean values ranging from 3.78 (term spread) to 6.31 (realized variance).
Similar findings are obtained for the value-growth return series (HML). Pockets take up a
fraction of the sample for this series that ranges from 0.25 (realized variance) to 0.34 (term spread),
with mean durations ranging from 232 days (realized variance) to 384 days (term spread). Average
$R^2$ values remain high, though a little below those found for the SMB series, ranging from 3.13 for
the realized variance predictor to 5.13 for the term spread.32
Table 8 reports performance results for local kernel regressions fitted to returns on the SMB
and HML portfolios. We focus on the unrestricted model forecasts since it is unclear how to impose
sign restrictions on expected return differentials or the slopes of the predictor variables.
First consider the statistical performance measures (Panel A). For both the SMB and HML
return series, and across all four predictors, inside pockets the local kernel regressions generate more
accurate out-of-sample return forecasts than the prevailing mean, resulting in highly significant CW
statistics. Conversely, the local kernel forecasts tend to be less accurate than the prevailing mean
out-of-pocket. In contrast to the results for the market portfolio, the in-pocket results dominate
for the full sample so that we now find a significantly better full-sample performance for three of
four predictors–the only exception being the realized variance.
All multivariate approaches–multivariate kernel, PC and combinations–generate forecasts that
are significantly more accurate than the benchmark both in-pocket and in the full sample, though
not during out-of-pocket periods. The first two combinations continue to be better than the simple
equal-weighted combination (comb3).
For the economic performance measures (Panel B), the alpha estimates are highly statistically
significant, ranging from 2.57% per annum for the term spread predictor to 3.43% for the realized
variance predictor applied to the SMB portfolio and from 2.33% to 3.29% for the term spread
and dp predictors applied to the HML portfolio. The first two forecast combinations boost this
performance by anywhere from 0.6% to 1.8% per annum.33
32
Appendix Figures A.3 and A.4 show the pockets identified for the HML and SMB portfolios. Detailed pocket
statistics are provided in Appendix Table A.13.
33
As in our main analysis of the market portfolio, we impose limits on portfolio weights between 0 and 2.

5 Pockets and Asset Pricing Models
Having presented our empirical evidence on the existence of local return predictability, we next use
our new measures of pocket characteristics as diagnostics for exploring whether a range of asset
pricing models can generate local return predictability patterns similar to those found empirically.

5.1 Overview of Models Selected


While it is impossible to explore all possible frameworks, we simulate from four workhorse rational
expectations asset pricing models which are representative of the dynamics of returns and state
variables implied by models with time-varying risk premia. In all cases, we select versions of these
models which are cast in continuous time (making it easy to simulate daily data) and employ
global solution algorithms which capture potential nonlinearities inherent in the models. Despite
matching a number of common features from the data, the models are quite distinct along a number
of dimensions which are representative of different structural explanations of the equity premium
puzzle proposed in the literature. We consider the following models:

1. A continuous-time version of the long-run risk model of Bansal and Yaron (2004), as calibrated
by Chen et al. (2009). This model features investors with Epstein-Zin preferences and two
state variables, namely the drift in the consumption growth process as well as a stochastic
volatility process that affects the mean consumption growth process.34 In the model, time
variation in the risk premium is almost exclusively driven by stochastic volatility.

2. The habit formation model of Campbell and Cochrane (1999), which features a single state
variable capturing investors’ “habit level” of consumption that generates time variation in
the effective risk aversion.35 Following a sequence of bad shocks, risk aversion and risk premia
rise, lowering asset prices.

3. The heterogeneous agents model of Gârleanu and Panageas (2015) which features two differ-
ent types of agents with different levels of risk aversion who optimally share claims on the
aggregate endowment. The model features a single state variable that captures the share of
wealth owned by one of the two types of agents. As the share of wealth owned by risk tolerant
agents decreases, risk premia rise, a force which generates excess volatility of asset prices.

4. The rare disaster model of Wachter (2013), which features investors with Epstein-Zin
preferences and a single state variable capturing the time-varying Poisson arrival rate of a rare
disaster, i.e., a permanent, large drop in the aggregate endowment.

34
Note that we emphasize a calibration which is more similar to the original Bansal and Yaron (2004) paper. Bansal,
Kiku and Yaron (2012) introduce an alternative calibration in which a larger fraction of variation is explained by
fluctuations in a more persistent stochastic volatility variable relative to fluctuations in the persistent expected growth
component. While we have not formally conducted simulation exercises for this specification, our existing results
suggest that adding a more persistent risk-premium shifter would strengthen Stambaugh (1999) biases and likely
hurt performance relative to the baseline presented here.
35
We use the continuous time version of the calibration from Wachter (2005), which also allows habit to affect the
risk-free interest rate.

In Appendix E, we provide details of how we simulate from these models, while Appendix F
and Table A.14 report a variety of unconditional moment statistics. In addition, Section 6 presents
(and draws similar conclusions from) a reduced-form present value model in the spirit of Campbell,
Lo and MacKinlay (1997) and van Binsbergen and Koijen (2010).
In each of these models, it is straightforward to construct proxies for three of our state variables,
namely the dividend-price ratio, the risk-free rate, and realized volatility of returns. As such, we
can draw initial levels of the state variables, then simulate daily samples with the same length as
our estimation sample. With these simulated times series, we compute our out-of-sample measures
of forecasting performance and several associated test statistics. Consistent with the conventions
of the rare disaster literature in making comparisons with postwar US data, we also conduct a set
of simulations where we restrict attention to sample paths where no disaster occurs.

5.2 Pitfalls of Identifying Short-Horizon Predictability


Given that all quantitative asset pricing models seek to rationalize several stylized facts from the
data, we first develop some intuition for why precisely these features suggest ex-ante that it should
be challenging for the canonical asset pricing models discussed above to generate time-varying
short-run return predictability consistent with what we find empirically.
Specifically, asset pricing models usually seek to match a fairly similar set of moments observed
in the data: 1) price-dividend ratios are stationary but quite persistent and volatile, 2) discount
rates explain a nontrivial fraction of variation in price-dividend ratios, 3) risk premia, rather than
risk-free rates, explain more of the variation in discount rates, and 4) state variables capturing
both discount rates and risk premia are usually quite persistent. The combination of these features
implies that returns are predictable, especially at longer horizons, by the price-dividend ratio with
modest $R^2$ values over medium term horizons, consistent with evidence from predictive regressions.
To understand why the canonical asset pricing models with forward-looking rational expecta-
tions struggle to generate detectable local return predictability pockets, suppose there is a spike in
risk premia. This could happen either because a persistent state variable shifts, and/or because the
sensitivity of the risk premium to the state variable changes in a model with time-varying param-
eters. Rational, forward-looking agents will then reduce their valuation of the asset, generating an
immediate offsetting effect on realized returns. The resulting pattern with a large negative shock
to realized returns followed by a sequence of slightly elevated returns is exactly what makes it
difficult to detect local return predictability in such models. Further, the more risk premia move,
the more volatile realized returns are likely to be, increasing estimation errors in local predictive
regression coefficients. A final concern is the Stambaugh bias because shocks to risk premia may
be correlated with innovations to the key regressors, an effect that can be particularly strong
at the higher (daily) frequency. These effects make local return predictability at high frequencies
extremely difficult to detect. Only at longer horizons, as the shock to the persistent risk premium
component has had time to build up, do we get more power to detect return predictability.
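The offsetting price effect described above can be made concrete with a standard log-linear approximation. This is a sketch under assumptions that are not part of the text: expected returns $x_t \equiv E_t[r_{t+1}]$ (in deviations from their mean) follow an AR(1) with persistence $\phi$ and innovation $\xi_{t+1}$, $\rho$ is the Campbell-Shiller log-linearization constant, and cash-flow news is set to zero.

```latex
\begin{align*}
x_{t+1} &= \phi\, x_t + \xi_{t+1}, \\
r_{t+1} - E_t[r_{t+1}]
  &= -(E_{t+1} - E_t)\sum_{j=1}^{\infty} \rho^{j}\, r_{t+1+j}
   = -\sum_{j=1}^{\infty} \rho^{j}\phi^{j-1}\,\xi_{t+1}
   = -\frac{\rho}{1-\rho\phi}\,\xi_{t+1}.
\end{align*}
```

With purely illustrative annual values $\rho = 0.96$ and $\phi = 0.9$, the multiplier is $0.96/0.136 \approx 7.1$, so a one-percentage-point increase in the expected return lowers the realized return by about seven percentage points on impact, while the subsequent compensation arrives only gradually as $x_t$ decays. In a short local window the initial negative shock therefore tends to swamp the elevated returns that follow.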

5.3 Simulation Results


Building on these observations, Table 9 shows simulation results for the four asset pricing models
using the unrestricted return predictions. For each performance measure listed in the rows, the
columns show the mean, standard error and p-value, the latter computed from the proportion
of simulations able to match the sample statistic which, for convenience, we list in the left-most
column. The top, middle and bottom panels report results for the three predictors generated as
part of the asset pricing models, namely the dp ratio, the risk-free rate, and the realized variance.
First consider the statistical performance as captured by the CW statistic. Across all three
predictors, all asset pricing models can match the full-sample and out-of-pocket accuracy of the
local kernel forecasts measured relative to the prevailing mean. None of the asset pricing models
get close to matching the in-pocket accuracy of the kernel regression forecasts, however, regardless
of which predictor is used.
Turning to the economic performance measures, the Campbell and Cochrane (1999) and
Gârleanu and Panageas (2015) models struggle to match the alphas found in the data. The Bansal
and Yaron (2004) and Wachter (2013) models are better able to match alphas for the predictive
return regressions that use the dp ratio but not so much for those that use the risk-free rate or the
realized variance predictors. None of the asset pricing models is able to match the alpha t-statistic
in the empirical data and they only match the Sharpe ratio for the models that use the dp predictor
but not for the other predictors.36
Taking stock, these results suggest that the presence of local return predictability pockets poses
a challenge in the sense that such patterns cannot be generated by a range of dynamic asset pricing
models spanning a wide spectrum of modeling assumptions. One might suspect that this is due
to the omission, by such models, of complicating factors such as time-varying heteroskedasticity
or highly persistent predictors whose innovations are correlated with shocks to the return process.
However, this is unlikely to be the explanation here since our earlier simulations of three statistical
models incorporated such features and found that they could not produce return patterns that
match the local return predictability pockets we find in the data.

36
Appendix Tables A.15 and A.16 show that similar results hold for the return predictions that impose constraints
on the sign of the excess return forecasts or restrict the signs of the slope estimates.

It is important to emphasize that we do not preclude the possibility that asset pricing models
with rational expectations can generate pockets of return predictability. For instance, one could
introduce a moderately persistent variable, $s_t$, which affects risk premia and risk-free rates by
offsetting amounts, thus preserving a signal which is potentially useful and avoids the problem
of offsetting noise. Specifically, a predictor such as the risk-free interest rate could be a linear
combination of low and high frequency components, in which case the projection of returns onto
the predictor may be time-varying. Constructing such a model falls outside the scope of our current
paper, however, and is left for future research.37

6 Sticky Expectations and Pockets of Predictability


In the previous section, we argued that our empirical findings of local return predictability pockets
posed challenges to a number of workhorse asset pricing models with time-varying risk premia.
In this section, motivated by a rapidly growing literature at the intersection of macroeconomics
and finance, we propose a model featuring sluggish adjustment of beliefs in the spirit of “sticky
information” models (Mankiw and Reis, 2002; Woodford, 2003; Sims, 2003; Coibion and Gorod-
nichenko, 2015), as well as departures from market efficiency reflecting tendencies of certain types
of information to be incorporated slowly into asset prices.
Our claim is not that a model with sticky expectations is the only, or even most plausible, way
to generate return predictability pockets. However, there are intuitive reasons why we would expect
sticky expectations models to be easier to reconcile with return predictability pockets. Compared
to a setup with rational expectations, sticky expectations models reduce the spikiness in asset prices
after a large shock to the true growth rate of cash flows (which leads to a predictable drift in realized
returns). Instead, the change in price levels is roughly zero on impact and will only gradually reflect
the change in valuations associated with using the correct cash flow growth rate. Further, sticky
expectations can introduce a wedge between agents’ expectations and the true conditional mean
of the cash flow growth rate process. We show that this wedge is correlated with observable state
variables in the sticky expectation model and that, as this wedge cumulates over time, these state
variables can be used in simple univariate regression models to identify local return predictability.

6.1 Present value model with sticky expectations


Following a modeling approach analogous to Bouchaud et al. (2019) and Gomez Cram (2021), our
starting point is a standard log-linearized present value model of asset prices. We first specify
37
Time variation in intermediaries’ net worth is another possible source of local return predictability since this
could explain why local return predictability is not arbitraged away in states with only limited access to arbitrage
capital. To explore this issue further, we conducted simulations from the asset pricing model proposed by Di Tella
(2017) which emphasizes intermediaries’ balance sheets in a model of optimal risk sharing between intermediaries
and households and provides a mechanism for generating time-varying risk premia. We found that this model yielded
results similar to those from the other asset pricing models and did not generate pockets consistent with what we
see in the data. While the key state variables in the model governing risk premia are somewhat less persistent in
this framework relative to some of the other models we consider, ultimately, we find similar results to Table 9. The
key variables fluctuate at business cycle frequencies, making it difficult to detect pockets of predictability in time
to exploit them meaningfully out-of-sample via our local kernel approach. Given the similarity with our existing
results, we omit these results for brevity.

the behavior of cash flows, then agents’ beliefs and subjective discount rates. Dividends evolve
according to the following law of motion under the objective probability distribution:

$$\Delta d_{t+1} = \mu_d + z_{cf,t} + \varepsilon_{d,t+1}, \tag{17}$$

$$z_{cf,t+1} = \rho_{cf}\, z_{cf,t} + \varepsilon_{cf,t+1}. \tag{18}$$

Consistent with the reduced form representation proposed by Bouchaud et al. (2019) and Coibion
and Gorodnichenko (2015), agents have sticky expectations in the spirit of Mankiw and Reis (2002).
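Letting Ft denote conditional expectations under agents’ subjective beliefs at time t, sticky expectations
are captured through

$$\begin{aligned} F_t[\Delta d_{t+1+h}] &= \mu_d + (1-\lambda)\, E_t[z_{cf,t+h}] + \lambda\, F_{t-1}[\Delta d_{t+1+h} - \mu_d] \\ &= \mu_d + (1-\lambda)\, \rho_{cf}^{h}\, z_{cf,t} + \lambda\, \rho_{cf}^{h}\, F_{t-1}[\Delta d_{t+1} - \mu_d]. \end{aligned} \tag{19}$$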

The basic intuition captured by these models is that agents’ beliefs about macroeconomic fundamentals
are somewhat slow in incorporating new information. Forecasts, even those of professional
economists, are therefore subject to predictable biases.
The state variable zcf,t captures a persistent shifter of expected cash flow growth, which is not
necessarily observable by agents in the model. Given the substantial debate about the extent to
which cash flows are predictable at medium to long horizons (Cochrane, 2008), it seems plausible
that a difficult-to-estimate variable like expected cash flow growth for the aggregate stock market
might be subject to information rigidities. We will allow such a possibility in the model below, and
discipline the magnitude of rigidities on estimates from microdata. For parsimony, we assume that
agents have rational expectations about all remaining state variables in the model.
Coibion and Gorodnichenko (2015) show that a specification for beliefs like equation (19) obtains
from two distinct microfoundations. The first is a sticky expectations model in which a measure
1 − λ of agents update their beliefs about the relevant variable each period. The second is a
setting in which zcf,t+1 is unobserved but agents individually observe noisy signals about the state
variable and update beliefs using the Kalman filter. In such a case, consensus expectations update
as a weighted average of the prior and the new signal.38 The parameter λ captures the degree of
sluggishness in the extent to which agents’ expectations update to reflect new information about
expected macroeconomic fundamentals embedded in the cash flow shock εcf,t . Rational expectations
are nested as a special case of (19) when λ = 0; stickiness increases as λ rises above zero.
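
As a minimal sketch of equations (17)-(19), the following Python snippet simulates the latent growth state and the sticky belief at horizon h = 0. It writes m_t for Ft [∆dt+1 − µd ] and uses the recursion m_t = (1 − λ) z_t + λ ρcf m_{t−1}, which follows from (19) once Ft−1 [∆dt+1 − µd ] is replaced by ρcf Ft−1 [∆dt − µd ]. The function names, the parameter values, and the zero initial condition are assumptions made for illustration and are not taken from the calibration.

```python
import numpy as np

def simulate_state(T, rho_cf, sigma_cf, seed=0):
    """Simulate z_{cf,t+1} = rho_cf * z_{cf,t} + eps_{cf,t+1}, as in (18)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma_cf, size=T)
    z = np.zeros(T)
    for t in range(1, T):
        z[t] = rho_cf * z[t - 1] + eps[t]
    return z

def sticky_beliefs(z, lam, rho_cf):
    """Belief m_t = F_t[dd_{t+1} - mu_d] implied by (19) with h = 0.

    m_t = (1 - lam) * z_t + lam * rho_cf * m_{t-1}, started from m_{-1} = 0
    (the initial condition is an assumption of this sketch).
    """
    m = np.zeros_like(z)
    prev = 0.0
    for t in range(len(z)):
        m[t] = (1.0 - lam) * z[t] + lam * rho_cf * prev
        prev = m[t]
    return m

# Hypothetical daily parameter values, chosen only for illustration.
z = simulate_state(T=2000, rho_cf=0.995, sigma_cf=0.01)
m_rational = sticky_beliefs(z, lam=0.0, rho_cf=0.995)  # lam = 0: beliefs equal z
m_sticky = sticky_beliefs(z, lam=0.98, rho_cf=0.995)   # lam near 1: slow updating
wedge = z - m_sticky                                   # belief discrepancy
```

With λ = 0 the belief coincides with the true state, while with λ close to one the belief path is much smoother than the state and a persistent discrepancy opens up after large shocks.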
To incorporate additional asset pricing dynamics, we introduce exogenous shifters of subjective
risk premia and risk-free rates which (for simplicity) are known, not subject to information rigidities,
38
Such a direct interpretation in this context requires that agents do not extract information from common signals
such as consensus forecasts and/or prices (as is assumed to be the case for a subset of agents in the model of Hong
and Stein, 1999). See also Barberis, Shleifer and Vishny (1998) and Daniel, Hirshleifer and Subrahmanyam (1998).

and obey the following laws of motion:

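$$F_t[r_{t+1} - r_{f,t+1}] = \mu_{rp} + z_{dr,t} \tag{20}$$

$$z_{dr,t+1} = \rho_{dr}\, z_{dr,t} + \varepsilon_{dr,t+1} \tag{21}$$

$$r_{f,t+1} = \mu_{rf} + \beta_{rf,dr}\, z_{dr,t} + \beta_{rf,cf}\, F_t[\Delta d_{t+1} - \mu_d] + z_{tp,t} \tag{22}$$

$$z_{tp,t+1} = \rho_{tp}\, z_{tp,t} + \varepsilon_{tp,t+1}. \tag{23}$$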

Here zdr,t allows for a “standard” risk premium channel and follows a homoskedastic AR(1) process.
The AR(1) state variable, ztp,t, allows for additional variables (e.g., time preference shocks)
capturing variation in the risk-free rate which is independent from expected cash flows and discount
rates. These variables generate independent variation in valuations, realized returns, and the
risk-free rate. We allow the risk-free interest rate to load on all three state variables, zdr,t , ztp,t ,
and subjective expected cash flow growth.
Similar to Katz, Lustig and Nielsen (2017) and Bouchaud et al. (2019), we assume that asset
prices satisfy an approximate present value identity under agents’ beliefs.39 We start with the
familiar log-linearized present value model:

$$r_{t+1} \approx k + \rho\,(p_{t+1} - d_{t+1}) + \Delta d_{t+1} + (d_t - p_t). \tag{24}$$

Iterating on this approximate accounting identity and taking expectations under agents’ subjective
beliefs yields the present value pricing formula:
 

$$p_t - d_t = \frac{k}{1-\rho} + F_t\left[\sum_{j=0}^{\infty} \rho^{j}\left[\Delta d_{t+1+j} - r_{t+1+j}\right]\right]. \tag{25}$$

As is well known in this literature, assuming a pricing formula such as (25) is not immediate and
involves a departure from full rationality since agents fail to fully incorporate signals – such as
information obtainable from local kernel regressions and equilibrium – which could be used to yield
more accurate forecasts of expected returns and cash flows.40
39
See also De La O and Myers (2021) and Gomez Cram (2021), who make the same assumption.
40
Note that we are implicitly assuming that asset prices reflect “consensus” expectations about cash flows of a
set of behavioral agents. As noted by Bouchaud et al. (2019), one could potentially introduce a more complicated
equilibrium involving interactions between boundedly rational agents with sticky expectations and more sophisticated
agents with more accurate beliefs but capital constraints. Consistent with their approach, we also do not pursue
such an extension here, but conjecture that it would likely result in similar qualitative dynamics as our simpler
specification, albeit attenuated quantitatively towards the rational expectations benchmark. Further, given that our
model features a distortion in beliefs about aggregate cash flows, any strategy of the sophisticated agents would be
impossible to implement without facing an exposure to substantial nondiversifiable risk. See also Angeletos and Huo
(2021) for a more explicit treatment of these issues in a related class of models.

Under these assumptions, we obtain by direct computation a valuation formula

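$$p_t - d_t = \frac{k}{1-\rho} + \frac{\mu_d - \mu_{rf} - \mu_{rp}}{1-\rho} + \frac{1 - \beta_{rf,cf}}{1 - \rho\cdot\rho_{cf}}\, F_t[\Delta d_{t+1} - \mu_d] - \frac{1 + \beta_{rf,dr}}{1 - \rho\cdot\rho_{dr}}\, z_{dr,t} - \frac{1}{1 - \rho\cdot\rho_{tp}}\, z_{tp,t}, \tag{26}$$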

which we can use to simulate returns under the objective law of motion given the state variables.
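
To make the valuation formula concrete, the sketch below evaluates the right-hand side of (26) for given values of the belief Ft [∆dt+1 − µd ] and the two exogenous state variables. The function name and all numerical values are placeholders chosen for illustration; they are not the calibrated parameters.

```python
def log_price_dividend(belief, z_dr, z_tp, *, k, rho, mu_d, mu_rf, mu_rp,
                       rho_cf, rho_dr, rho_tp, beta_rf_cf, beta_rf_dr):
    """Evaluate the right-hand side of (26).

    belief stands for F_t[dd_{t+1} - mu_d]; the three state arguments can be
    scalars or NumPy arrays of equal shape.
    """
    const = (k + mu_d - mu_rf - mu_rp) / (1.0 - rho)
    cash_flow = (1.0 - beta_rf_cf) / (1.0 - rho * rho_cf) * belief
    discount = (1.0 + beta_rf_dr) / (1.0 - rho * rho_dr) * z_dr
    time_pref = z_tp / (1.0 - rho * rho_tp)
    return const + cash_flow - discount - time_pref

# Placeholder daily parameter values (assumptions, not the calibration).
params = dict(k=0.0019, rho=0.9998, mu_d=0.0002, mu_rf=0.0001, mu_rp=0.0003,
              rho_cf=0.995, rho_dr=0.999, rho_tp=0.99,
              beta_rf_cf=0.2, beta_rf_dr=0.1)

pd_steady = log_price_dividend(0.0, 0.0, 0.0, **params)      # all states at zero
pd_optimistic = log_price_dividend(0.001, 0.0, 0.0, **params)  # higher expected growth
pd_high_premium = log_price_dividend(0.0, 0.001, 0.0, **params)  # higher risk premium
```

When βrf,cf is below one, a higher subjective growth expectation raises the price-dividend ratio, while a higher subjective risk premium lowers it.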

6.2 Subjective and objective return predictability


Next, we discuss sources of return predictability in the sticky expectations model. Supposing that
all state variables were observed, the expected excess return under the objective measure satisfies

$$E_t[r_{t+1} - r_{f,t+1}] = \mu_r + z_{dr,t} + \left(1 + \frac{(1 - \beta_{rf,cf})\,\rho\,\rho_{cf}\,(1-\lambda)}{1 - \rho\cdot\rho_{cf}}\right)\left(z_{cf,t} - F_t[\Delta d_{t+1} - \mu_d]\right). \tag{27}$$

The first term zdr,t captures a standard component associated with agents’ subjective risk premium,
as in a standard present value model. Whenever agents do not have rational expectations (λ ≠ 0),
there is a second term capturing a wedge between the objective forecast an econometrician would
make if zcf,t were known and the agent’s forecast of the risk premium, which is the sum of two
components. First, if zcf,t exceeds agents’ subjective expectation of dividend growth, cash flows
will tend to surprise in the positive direction. Second, as beliefs about future growth rates gradually
mean-revert towards the true expectation (agents become more optimistic), the price-dividend ratio
will also continue to drift upwards.41
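
The two components of the wedge term in (27) can be illustrated with a short sketch. The helper below computes the coefficient multiplying zcf,t − Ft [∆dt+1 − µd ] and the implied objective expected excess return. The function names and parameter values are assumptions made for illustration, apart from λ = 0.981, which is the value reported in Section 6.3.

```python
def wedge_loading(beta_rf_cf, rho, rho_cf, lam):
    """Coefficient multiplying the belief wedge in (27).

    The leading 1 is the direct cash flow surprise; the second term is the
    expected drift in the price-dividend ratio as beliefs catch up.
    """
    return 1.0 + (1.0 - beta_rf_cf) * rho * rho_cf * (1.0 - lam) / (1.0 - rho * rho_cf)

def objective_expected_excess_return(mu_r, z_dr, z_cf, belief, *,
                                     beta_rf_cf, rho, rho_cf, lam):
    """Right-hand side of (27); belief stands for F_t[dd_{t+1} - mu_d]."""
    wedge = z_cf - belief
    return mu_r + z_dr + wedge_loading(beta_rf_cf, rho, rho_cf, lam) * wedge

# Hypothetical daily values (not the calibration), except lam.
params = dict(beta_rf_cf=0.2, rho=0.9998, rho_cf=0.995, lam=0.981)

# Beliefs equal to the true state: only the subjective premium remains.
no_wedge = objective_expected_excess_return(0.0003, 0.0001, 0.002, 0.002, **params)
# Beliefs below the true state: objective expected returns are higher.
pessimistic = objective_expected_excess_return(0.0003, 0.0001, 0.002, 0.001, **params)
```

In this sketch the loading exceeds one whenever λ < 1 and βrf,cf < 1, so a given belief gap moves objective expected returns by more than one-for-one.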
By iterative substitution of the state dynamics above, we obtain a Wold decomposition for the
difference between subjective and objective expectations of dividend growth:
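$$z_{cf,t} - F_t[\Delta d_{t+1} - \mu_d] = \sum_{j=0}^{\infty}\Big(\underbrace{\rho_{cf}^{j}}_{\substack{\text{rational expectations}\\ MA(\infty)\ \text{coefficient}}} - \underbrace{\rho_{cf}^{j}\,(1 - \lambda^{j+1})}_{\substack{\text{sticky expectations}\\ MA(\infty)\ \text{coefficient}}}\Big)\,\varepsilon_{cf,t-j} = \sum_{j=0}^{\infty} \rho_{cf}^{j}\,\lambda^{j+1}\,\varepsilon_{cf,t-j}. \tag{28}$$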

This term is an exponentially-weighted moving average of recent shocks to expected cash flow
growth which reflects the sluggish response of beliefs to persistent cash flow information. Estimates
of λ from the literature suggest that sluggishness of beliefs is considerably lower than persistence of
expected macroeconomic growth rates. This means that the leading term is λ^{j+1} and so this term
depends mostly on fairly recent shocks, adding a high-frequency component to expected returns.
Return innovations relative to subjective risk premia (rt+1 − µr − zdr,t ) therefore display “local
momentum”. Each return is a noisy signal of the geometric sum in equation (28), so returns will
tend to positively comove (even after netting out subjective risk premia) at short horizons. This
41
In our calibration, 1 − λ is larger than 1 − ρρcf and (1 − β rf,cf )ρ · ρcf is a bit smaller than 1, so the second term
can potentially be quite large (around six in the current calibration).

point can be made more formally by writing the model in state space form:

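$$z_{cf,t+1} - F_{t+1}[\Delta d_{t+2} - \mu_d] \equiv \vartheta_{t+1} = \rho_{cf}\,\lambda\,\vartheta_t + \lambda\,\varepsilon_{cf,t+1},$$

$$r_{t+1} - r_{f,t+1} = \mu_r + z_{dr,t} + \left(1 + \frac{(1 - \beta_{rf,cf})\,\rho\,\rho_{cf}\,(1-\lambda)}{1 - \rho\cdot\rho_{cf}}\right)\vartheta_t + u_{t+1}, \tag{29}$$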

where the residuals ut+1 and εcf,t+1 have a modest positive correlation since pdt+1 slightly responds
to the current cash flow shock εcf,t+1 . Supposing for simplicity that zdr,t were observed, to a
fairly close approximation the Kalman filter will imply that an econometrician’s forecast of ϑt is
an exponentially weighted moving average of rt+1 − µr − zdr,t .42 High recent past returns signal
the likelihood that future returns will stay high over the near term. The constant term from our
local kernel regression involves a weighted moving average of recent data and thus captures similar
features. In our local kernel regressions, recent changes in state variables play a dual role: 1) they
may be correlated with the subjective risk premium zdr,t ; and 2) they also provide informative
signals on how beliefs about cash flows have changed in the recent past. Both forces combine to
allow the econometrician to constructively (though imperfectly) capture an estimate of ex-ante
expected returns which is detectable in real time, as we demonstrate below.43
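
The filtering argument can be sketched numerically. The snippet below (i) checks that the recursion for ϑt in (29) reproduces the moving-average weights in (28), and (ii) runs a scalar Kalman filter on simulated return surprises, ignoring the correlation between ut+1 and εcf,t+1. The loading of six follows the order of magnitude mentioned in footnote 41 and λ = 0.981 is the value reported in Section 6.3; the remaining values, the function names, and the simulated data are assumptions for illustration only.

```python
import numpy as np

def wedge_recursion(eps, lam, rho_cf):
    """theta_{t+1} = rho_cf*lam*theta_t + lam*eps_{t+1}, first line of (29)."""
    theta = np.zeros(len(eps))
    prev = 0.0
    for t, e in enumerate(eps):
        theta[t] = rho_cf * lam * prev + lam * e
        prev = theta[t]
    return theta

def wedge_ma(eps, lam, rho_cf):
    """Same object built from the MA weights rho_cf**j * lam**(j+1) in (28)."""
    T = len(eps)
    j = np.arange(T)
    weights = rho_cf ** j * lam ** (j + 1)
    # theta_t = sum_j weights[j] * eps[t-j]; pre-sample shocks are set to zero.
    return np.array([weights[: t + 1] @ eps[t::-1] for t in range(T)])

def filter_wedge(y, c, phi, q, r):
    """Scalar Kalman filter for theta_t when y_t = c*theta_t + u_t.

    phi is the persistence of theta, q the variance of its innovation and
    r the variance of u. Returns the filtered path and the Kalman gains.
    """
    a, P = 0.0, q / (1.0 - phi ** 2)       # start from the stationary variance
    est, gains = np.zeros(len(y)), np.zeros(len(y))
    for t, obs in enumerate(y):
        K = P * c / (c * c * P + r)        # gain on the return surprise
        a = a + K * (obs - c * a)          # update the estimate of theta_t
        P = (1.0 - K * c) * P
        est[t], gains[t] = a, K
        a, P = phi * a, phi ** 2 * P + q   # predict the next period's theta
    return est, gains

rng = np.random.default_rng(1)
lam, rho_cf, c = 0.981, 0.995, 6.0         # rho_cf is hypothetical
sigma_eps, sigma_u, T = 5e-4, 1e-2, 5000   # hypothetical shock volatilities
eps = rng.normal(0.0, sigma_eps, T)
theta = wedge_recursion(eps, lam, rho_cf)
y = c * theta + rng.normal(0.0, sigma_u, T)  # return surprise net of mu_r and z_dr
est, gains = filter_wedge(y, c, rho_cf * lam, (lam * sigma_eps) ** 2, sigma_u ** 2)
```

Once the gain has settled at a constant K, the filtered estimate obeys a_{t+1} = ρcf λ (1 − K c) a_t + ρcf λ K y_{t+1}, which is an exponentially weighted moving average of past return surprises.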
Low-frequency movements in dpt reflect persistent variation in risk premia, but also expected
cash flow growth rates and real interest rates, whereas higher frequency movements also reflect
revisions in agents’ beliefs which were unanticipated by agents but lead to predictable movements
in valuation ratios. These factors also affect expected excess returns under the objective measure
with different signs: recent increases in dpt signal the likelihood of further upward drift over the near
term due to sticky expectations, whereas low-frequency changes in dpt are expected to gradually
mean revert downwards. Further, even though our model has homoskedastic shocks for simplicity,
since agents’ forecast errors include ϑt , realized variance of returns also provides a noisy signal about
the absolute value of ϑt . Thus, all of the predictive regressions we consider are misspecified due
to omitted variable bias coming from mismeasured predictors. This creates scope for benefits from
using multivariate forecasts and/or univariate forecast combinations to further improve performance
by controlling for more sources of omitted variable bias and averaging across different sources of
misspecification, respectively.
In principle, local return predictability can arise from both zdr,t and ϑt . In practice, the latter
channel turns out to be far more important than the former in our simulation exercises.44 The
rationale for why a “standard” time-varying risk premium channel does not go very far is quite
42
If we ignore the fact that ut+1 and εcf,t+1 have a slight positive correlation, we have a standard Kalman filtering
problem. Given that T is quite large and ρcf λ is fairly far from one, the impact of initial conditions will rapidly
dissipate, and the Kalman gain converges to a constant.
43
Figure A.5 in Appendix G presents impulse responses from large shocks to zdr,t and zcf,t .
44
To see this more formally, we conduct experiments below where we first subtract zdr,t from returns before
conducting our out-of-sample experiments. Performance is quite similar, and actually slightly better after doing so,
a result which is sensible in light of our findings in the previous section.

similar to that discussed earlier when explaining the failure of conventional asset pricing models
to match our evidence. Low-frequency movements in state variables which are common over each
fitting window are approximately differenced out in the local regressions; in contrast, these effects
dominate in constant coefficient specifications. Given the high persistence of zdr,t and the substantial
negative correlation between return innovations and changes in zdr,t, the effects of estimation
error more than offset any small potential gains from timing the market using estimates of the
subjective risk premium due to a very low signal-to-noise ratio and substantial Stambaugh (1999)
bias. In contrast, ϑt is considerably less persistent and the correlations which modulate the degree
of Stambaugh bias are considerably weaker. Accordingly, there is more scope for our methods to
detect pockets of predictability in our sticky expectations framework below.
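
The role of persistence and of the innovation correlation can be illustrated with the standard first-order approximation to the Stambaugh (1999) bias, which combines the relation E[b̂ − b] ≈ (σuv /σv²) E[ρ̂ − ρ] with the Kendall approximation E[ρ̂ − ρ] ≈ −(1 + 3ρ)/T. The sketch below is a textbook approximation and not a calculation from our simulations; the function name and the numerical inputs are hypothetical.

```python
def stambaugh_bias(rho, T, cov_uv, var_v):
    """Approximate bias of the OLS slope in r_{t+1} = a + b*x_t + u_{t+1}
    when x_{t+1} = c + rho*x_t + v_{t+1}.

    Uses E[b_hat - b] ~= (cov_uv / var_v) * E[rho_hat - rho] together with
    E[rho_hat - rho] ~= -(1 + 3*rho) / T.
    """
    return (cov_uv / var_v) * (-(1.0 + 3.0 * rho) / T)

# Hypothetical inputs with innovations normalized so that var_v = 1.
# A highly persistent predictor whose innovations are strongly negatively
# correlated with return innovations:
bias_persistent = stambaugh_bias(rho=0.999, T=500, cov_uv=-0.9, var_v=1.0)
# A less persistent predictor with a much weaker correlation:
bias_transitory = stambaugh_bias(rho=0.975, T=500, cov_uv=-0.1, var_v=1.0)
```

Under these illustrative inputs the approximate bias is several times larger for the persistent, strongly correlated predictor, mainly because of the difference in the innovation covariance.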
Specifically, as is clear from (29), expected excess returns under the objective probability measure
are a linear combination of the slow moving subjective risk premium zdr,t and the much less
persistent belief discrepancy ϑt . In model simulations illustrated in Appendix Figure A.6, taking
the risk-free rate as an example, we find that pockets of predictability are particularly likely
to occur shortly (2-3 months) after periods in which |ϑt | is large, i.e., periods in which a larger
fraction of expected excess return variation is explained by the high-frequency belief component.
In contrast, the state variable capturing the rational risk premium zdr,t is essentially uncorrelated
with the pocket dummy. Related to this, the time series correlation between the risk free rate and
ϑt as well as the predictive coefficient on the risk free rate both tend to be larger in absolute value
inside pockets. In other words, more of the variation in the state variables reflects changes in the
high-frequency component, which makes it easier to capture return predictability using our local
regressions.45

6.3 Quantitative Assessment


We next simulate from our calibrated model with sticky expectations and repeat our empirical
exercises using model-generated data. In these simulations, we carefully fix the parameters of the
sticky expectations model to match moments of the data such as the annualized sample means of
dividend growth, the risk free rate, and expected returns.
While Appendix H provides further details about how we calibrate our model, it is important
to note what is not targeted in these calibrations. The central stickiness parameter is fixed ex-ante
using the empirical estimates of Coibion and Gorodnichenko (2015) so that λ = 0.3^{4/252} ≈ 0.981.
In other words, we deliberately fix the degree of information rigidity based on estimates from the
literature, and the asset pricing moments selected are fairly standard and, as such, not explicitly
tied to any evidence related to pockets of predictability. Therefore, we view our examination of the
45
In addition, the properties of expected returns and the covariance between returns and lagged predictors change
in a direction which is favorable for detecting predictability during periods where true expected cash flow growth
rates recently changed substantially (high |ϑt |). Our local kernel regression forecasts, by adapting to these changing
covariances, are able to detect a meaningful level of out-of-sample predictability.

model’s ability (or lack thereof) to match evidence related to pockets as a nontargeted validation
test of the model.
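
The arithmetic behind the stickiness parameter can be checked directly. The sketch below reads 0.3 as a quarterly stickiness value and 252 as the number of trading days per year, so that one quarter contains 252/4 = 63 days; this reading of the formula, and the function name, are assumptions on our part.

```python
import math

def per_period_stickiness(lam_quarterly, periods_per_quarter):
    """Stickiness per sub-period that compounds to lam_quarterly over a quarter."""
    return lam_quarterly ** (1.0 / periods_per_quarter)

lam_daily = per_period_stickiness(0.3, 252 / 4)   # equals 0.3 ** (4 / 252)

# Number of days after which lam_daily ** n falls to one half.
half_life_days = math.log(0.5) / math.log(lam_daily)
```

The daily value rounds to 0.981, and the weight λ^n placed on old information halves after roughly seven weeks of trading days.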
As additional points of comparison, we consider two alternative models. The first is a rational
expectations version of our model with the same true cash flow dynamics but no information
rigidities (λ = 0). The second is a rational expectations model whose parameters are recalibrated
with λ = 0. Since the effects of sticky expectations on unconditional asset pricing moments are
fairly modest, these recalibrated parameters are similar to those from our baseline model.
We summarize the results from these experiments in Table 10, using a format similar to earlier
with different columns corresponding to different asset pricing models. The first column includes
our benchmark sticky expectations model, while the next two columns include the two different
calibrations of analogous rational expectations models as described above. Each block of results
tabulates the CW statistics, computed overall, in pocket, and out of pocket, respectively, as well
as performance measures from our market timing regressions. The top three panels show results
for the individual predictor variables, followed by results from the multivariate kernel specification
and the three forecast combinations.
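
For readers who want to see how statistics of this kind can be computed overall, in pocket, and out of pocket, the sketch below implements the textbook MSPE-adjusted statistic of Clark and West (2007), assuming that this is what “CW” denotes. It uses a plain sample standard error rather than a heteroskedasticity and autocorrelation robust one, so it is not necessarily the exact variant behind Table 10. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def clark_west_tstat(y, f_bench, f_model):
    """MSPE-adjusted t-statistic for a benchmark nested in a larger model."""
    y, f_bench, f_model = (np.asarray(a, dtype=float) for a in (y, f_bench, f_model))
    adj = (y - f_bench) ** 2 - ((y - f_model) ** 2 - (f_bench - f_model) ** 2)
    return adj.mean() / (adj.std(ddof=1) / np.sqrt(adj.size))

def clark_west_by_regime(y, f_bench, f_model, in_pocket):
    """Compute the statistic on all dates, pocket dates, and non-pocket dates."""
    y, f_bench, f_model = (np.asarray(a, dtype=float) for a in (y, f_bench, f_model))
    mask = np.asarray(in_pocket, dtype=bool)
    pick = lambda m: clark_west_tstat(y[m], f_bench[m], f_model[m])
    return {"overall": pick(np.ones_like(mask)),
            "in_pocket": pick(mask),
            "out_of_pocket": pick(~mask)}

# Synthetic example: the signal forecasts returns only inside one pocket.
rng = np.random.default_rng(0)
n = 2000
in_pocket = np.zeros(n, dtype=bool)
in_pocket[500:1000] = True
signal = rng.normal(size=n)
y = np.where(in_pocket, signal, 0.0) + rng.normal(size=n)
stats = clark_west_by_regime(y, np.zeros(n), signal, in_pocket)
```

In this synthetic example the benchmark forecast is zero for simplicity; in an application it would be replaced by the no-predictability benchmark forecast.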
In stark contrast to the asset pricing models considered in Table 9, as well as the rational
expectations versions of our model with similar cash flow and subjective discount rate dynamics,
the model with sticky expectations is capable of replicating a number of the patterns observed in the
data. Local predictive regressions are consistently capable of detecting meaningful out-of-sample
predictability, especially in-pocket, whereas they struggle outside of pockets. Across specifications,
CW t-statistics are consistently the highest (and higher than full sample coefficients) inside pockets,
though full sample t-statistics are somewhat higher than in the data. The latter feature likely
reflects the fact that shocks are Gaussian in the model, so the tendency to overfit large realized
return shocks in our simulated samples is more muted relative to the data.46 The bottom panels of
Table 10 illustrate that the multivariate kernel specification and forecast combinations also work
well, in line with the intuition discussed above, with combination forecasts further benefitting from
reductions in estimation errors due to overfitting.
The middle and right panels of Table 10 illustrate that the ability to detect pockets of predictability
via our local kernel approach does not transfer to the calibrated models with rational
expectations. Analogously with the simulation exercises from the asset pricing models from Table
9, estimation error swamps any ability to reliably exploit information from our time-varying forecasts
despite the fact that returns are predictable by zrp,t . This result obtains in part because our
state variables do not perfectly reveal zrp,t but the dominant force is parameter estimation error.
Consistent with our results on the CW statistics, simulated market timing regressions indicate
that an investor could meaningfully improve her Sharpe ratio by adjusting her weights on the
46
In the model, we could easily replicate these features by introducing jumps in zrp,t+1 , ztp,t+1 , and/or εd,t+1 ,
though we elected not to introduce these extra parameters in the interest of parsimony. Relatedly, the absence of
large jumps likely reduces jumps in realized volatility and likely improves its performance as a predictor relative to
the empirical application.

market using our local kernel approach. These results obtain across all predictors we consider and
again we find that combination and multivariate approaches work well. However, while our timing
strategy generates nontrivial improvements in the Sharpe ratio, such a strategy remains subject to
considerable risk. In contrast, market-timing alpha estimates are consistently negative across all
specifications in the rational expectations models, despite the fact that the underlying models do
feature time-varying risk premia.
Further examination shows that the full-sample regression coefficients of excess returns on pd
and rf are both almost identical for the sticky and rational expectations models, suggesting that the
long run return predictability patterns are similar in these types of models. Accordingly, our results
are not incompatible with evidence already established with constant coefficient specifications.
Moreover, the overall degree of “mispricing” is fairly modest in this economy. To see this, we
can decompose Var[pdt] into the sum of three pieces, namely the variance of the price that would
obtain under the rational expectations beliefs Var[pd∗t], the variance of the difference between the
observed price-dividend ratio and this “correct” one Var[pdt − pd∗t], and two times the covariance
between the two terms 2 Cov[pd∗t, pdt − pd∗t]. We find that

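$$1 = \frac{\mathrm{Var}[pd_t]}{\mathrm{Var}[pd_t]} = \underbrace{\frac{\mathrm{Var}[pd^{*}_t]}{\mathrm{Var}[pd_t]}}_{1.0261\ \text{in model}} + \underbrace{\frac{\mathrm{Var}[pd_t - pd^{*}_t]}{\mathrm{Var}[pd_t]}}_{0.0098\ \text{in model}} + \underbrace{\frac{2\,\mathrm{Cov}[pd^{*}_t,\ pd_t - pd^{*}_t]}{\mathrm{Var}[pd_t]}}_{-0.0358\ \text{in model}}. \tag{30}$$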

The observed price-dividend ratio thus tracks the “true” one fairly closely overall. The variance
of the true price-dividend component pd∗t is more than 100 times larger than the variance of the
“pricing error” pdt − pd∗t component, which indicates that these two variables are quite similar at
low frequencies. However, the two can deviate by nontrivial amounts at higher frequencies.
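
The decomposition in (30) is straightforward to reproduce on any pair of simulated series. The sketch below computes the three shares and confirms that the values reported in (30) add up to one, up to rounding. The function name and the simulated inputs are hypothetical and do not come from the calibrated model.

```python
import numpy as np

def pd_variance_shares(pd_obs, pd_star):
    """Shares of Var[pd] due to pd*, the pricing error, and twice their covariance."""
    pd_obs = np.asarray(pd_obs, dtype=float)
    pd_star = np.asarray(pd_star, dtype=float)
    err = pd_obs - pd_star
    total = pd_obs.var()
    cov = np.cov(pd_star, err, ddof=0)[0, 1]   # population covariance, matching var()
    return pd_star.var() / total, err.var() / total, 2.0 * cov / total

# Hypothetical series: a slow-moving "correct" ratio plus a small pricing error.
rng = np.random.default_rng(0)
pd_star = 0.01 * np.cumsum(rng.normal(size=1000))
pd_obs = pd_star + 0.002 * rng.normal(size=1000)
shares = pd_variance_shares(pd_obs, pd_star)

# Shares reported in equation (30).
reported = (1.0261, 0.0098, -0.0358)
reported_sum = sum(reported)
```

Because pdt = pd∗t + (pdt − pd∗t), the three shares sum to one by construction, so the check mainly guards against using inconsistent variance and covariance normalizations.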
Intriguingly, one might have thought that there is a tension between the evidence which suggests
that return predictability is elusive, almost nonexistent, and/or fragile at high frequencies and the
evidence/theoretical work on fundamental drivers of fluctuations in asset prices at lower frequencies.
Our model suggests that this is not necessarily the case. Small, high-frequency discrepancies in
price levels related to behavioral biases/information frictions can inject considerable noise into short
horizon risk premium estimates without invalidating the insights we glean about predictability at
longer horizons from models with rational, low-frequency fluctuations in risk premia.
Finally, while we did not explicitly introduce a cross-section of different assets to be priced here,
an extension to different assets with cash flow growth rates that load differentially on our aggregate
state variables is straightforward. While we have not performed a quantitative assessment, such
an extension can easily match the market timing results which we obtained for the size and value
portfolios in Table 8 qualitatively. This occurs because the sticky expectations model quite naturally
generates factor momentum, which is an important component of the overall return to momentum
strategies (Moskowitz, Ooi and Pedersen, 2012; Ehsani and Linnainmaa, 2021). In our sticky
expectations model, due to the sluggish incorporation of news about fundamentals into prices,

stocks (and factors) whose prices have recently increased are likely to continue to drift upwards.
Intriguingly, Ehsani and Linnainmaa (2021) find that factor momentum is particularly concentrated
in factors that explain the largest share of variation in the cross-section of realized returns, i.e., in
portfolios which contain substantial macro information. Thus, sluggish incorporation of macro
news into agents’ information sets could plausibly be connected to patterns of factor momentum
in the data.47

6.4 Direct Evidence on the Mechanism


Finally, we provide some direct evidence which links the expected return forecasts used in our
market timing strategy and measures of biases in the beliefs of professional forecasters. Specifically,
we use the original data considered by Coibion and Gorodnichenko (2015), which include a measure
of the forecast errors made by forecasters in various longitudinal surveys.48 Consistent with the
analysis in Section III of their paper, we focus on the quarterly subsample of forecasts from the
Survey of Professional Forecasters (SPF). Specifically, Coibion and Gorodnichenko (2015) compute
xq+h − Ftq [xq+h ] for several macroeconomic variables xq and at various forecast horizons h ≥ 0,
where Ftq [xq+h ] is the consensus forecast for quarter q as of date tq .49
Under rational expectations, forecast errors as defined above should be orthogonal to any information
which was available as of time tq . Given that the information contained in our expected
return forecasts from prior to tq would have been available by the time at which the survey was conducted,
our forecasts should be uncorrelated with these forecast errors. In our theoretical model,
these forecast errors would map into unexpected cash flow shocks and, in turn, realized return
surprises from the perspective of agents with sticky expectations. Since direct forecasts of dividend
growth are not available, we consider three choices for the variable x, all of which capture information
about the business cycle: real GDP growth gy, the unemployment rate ue (in percentage
points), and real industrial production growth ip. Under sticky expectations, we would expect to
see a positive correlation between our return forecasts and forecast errors in procyclical variables
like gy and ip and a negative correlation with the countercyclical variable ue.
For each variable and each consensus forecast date, we compute an average of quarterly forecast
47
Moreover, to the extent that sluggish incorporation of information, especially aggregate information, into beliefs
is a general feature of how agents process information about future aggregate payoffs, it is somewhat less surprising
to find that momentum appears across a wide variety of asset classes (Asness, Moskowitz and Pedersen, 2013) and
that momentum strategies might co-move. Provided that recent winners include stocks that load disproportionately
on macroeconomic factors about which agents were revising beliefs most aggressively, they will also tend to fall the
most if these revisions in beliefs turned out to be incorrect. Such a phenomenon could generate momentum crashes
around business cycle turning points (Daniel and Moskowitz, 2016).
48
Data are available from https://www.aeaweb.org/articles?id=10.1257/aer.20110306.
49
Note that h = 0 corresponds to a “nowcast” of a quarterly variable produced during the middle of the current
quarter, the time at which the survey is administered.

errors at multiple horizons h ∈ {0, . . . , 4} as follows

$$\hat{\varepsilon}_{x,t_q} \equiv \frac{1}{5}\sum_{h=0}^{4}\left(x_{q+h} - F_{t_q}\, x_{q+h}\right), \tag{31}$$

where tq refers to the time at which the forecast is formed, and Ftq xq+h are h−period-ahead forecasts
from the SPF formed at time tq of the quarterly variable x. We then study the correlation between
ex-ante return forecasts from our time-varying coefficient models and ε̂x,tq , using forecasts from
both the univariate and multivariate models that switch to the prevailing mean forecast outside of
ex-ante identified pockets. We convert Coibion-Gorodnichenko forecast errors to a daily frequency
by setting ε̂t = ε̂tq for all days t in quarter q. Respondents to the SPF send in their forecasts around
the middle of each quarter, so to avoid possible look-ahead bias we only use return forecasts from
the first month of each quarter when estimating these correlations.50 We then estimate correlations
between r̂t|t−1 and ε̂t and compute Newey-West standard errors using a rule-of-thumb bandwidth.
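
The construction described above can be sketched as follows. The first function implements the average in (31); the second maps quarterly forecast errors to days and computes the correlation using only days in the first month of each quarter. Newey-West standard errors are omitted for brevity. The array layout, the function names, and the toy inputs are assumptions made for illustration.

```python
import numpy as np

def average_forecast_error(realized, forecasts):
    """Equation (31): mean over h = 0,...,4 of x_{q+h} - F_{t_q} x_{q+h}.

    realized[q] is x_q and forecasts[q, h] is the forecast of x_{q+h} made
    at t_q. Quarters whose target quarters are not all observed get NaN.
    """
    realized = np.asarray(realized, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)
    Q, H = forecasts.shape                      # H is 5 in equation (31)
    out = np.full(Q, np.nan)
    for q in range(Q - H + 1):
        out[q] = np.mean(realized[q:q + H] - forecasts[q])
    return out

def first_month_correlation(r_hat, quarter_of_day, month_in_quarter, eps_q):
    """Correlation between daily return forecasts and quarterly forecast errors.

    quarter_of_day[t] indexes the quarter containing day t; month_in_quarter[t]
    is 1, 2 or 3. Only first-month days with an available error are used.
    """
    eps_daily = np.asarray(eps_q)[np.asarray(quarter_of_day)]
    keep = (np.asarray(month_in_quarter) == 1) & ~np.isnan(eps_daily)
    return np.corrcoef(np.asarray(r_hat)[keep], eps_daily[keep])[0, 1]

# Toy inputs: eight quarters, zero forecasts, one "day" per month.
realized = np.arange(8.0)
forecasts = np.zeros((8, 5))
eps_q = average_forecast_error(realized, forecasts)
quarter_of_day = np.repeat(np.arange(8), 3)
month_in_quarter = np.tile([1, 2, 3], 8)
r_hat = np.repeat(np.arange(8.0), 3)            # hypothetical return forecasts
corr = first_month_correlation(r_hat, quarter_of_day, month_in_quarter, eps_q)
```

Restricting the sample to first-month days mirrors the timing argument in the text: those return forecasts predate the middle-of-quarter survey responses.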
Our estimated correlations and 95% standard error bands are reported in Figure 3, which
contains three groupings of ten bars. Each grouping corresponds to the forecast errors of one of the
variables from Coibion and Gorodnichenko (2015). Each of the nine bars corresponds to a model for
forecasting excess returns and the height of each bar corresponds to the correlation between these
forecasts and the forecast errors. Consistent with our proposed sticky expectations mechanism, we
find a robust empirical link between our expected return forecasts and future forecast errors. For
instance, forecasts based on the T-bill rate have a correlation of around 50% with future forecast
errors. While signs are consistent across all specifications, the correlations are weaker for the
multivariate kernel model and stronger for the three forecast combinations.
As a final observation, pockets tend to be periods of time in which ϑt is large in absolute
value. Since both rt and rt−1 have components which are linear in ϑt and ϑt−1 , respectively,
autocorrelation tends to be larger inside versus outside of pockets. We see exactly this pattern in
Appendix Table A.1, especially for predictors other than dp.
In conclusion, whereas the models with rational expectations do not match our evidence related
to short-horizon return predictability, a simple model with sticky expectations can account for
our evidence both qualitatively and quantitatively. Further, the expectations data in this setting
provide direct evidence showing that our ex-ante return forecasts explain a non-trivial amount of
predictable variation in professional forecasters’ expectation errors.
50
As an additional robustness check, we lead the forecast errors by one additional quarter and repeat the analysis.
These correlations, which are reported in Appendix Figure A.7, are similar in terms of signs and statistical significance
but are somewhat attenuated towards zero.

7 Conclusion
We develop a nonparametric kernel regression approach to detect pockets with local predictability of
stock returns. Our out-of-sample approach uses real-time information to monitor for improvements
in the accuracy of return forecasts from the local kernel regression model relative to a benchmark
no-predictability model. Empirically, we find evidence that while stock returns are unpredictable
the vast majority of the time, there are relatively short-lived pockets in which stock returns can be
predicted. Moreover, such out-of-sample return predictability is sufficiently large to be exploitable
for economic gains, particularly if used in conjunction with economic constraints on the return
forecasts or forecast combination methods that incorporate information on which models identify
local pockets at a given point in time.
To explore possible sources of return predictability, we simulate returns from a range of statistical
models that incorporate features such as highly persistent predictors, time-varying heteroskedasticity,
and Stambaugh (1999) bias. We also simulate returns from a set of workhorse asset
pricing models representative of the dynamics of returns and state variables consistent with time-varying
risk premia. Both types of models fail to match the empirical evidence of in-pocket return
predictability and its implications for the investment performance of a simple dynamic trading
strategy set up to exploit pockets with return predictability.
Finally, building on recent papers such as Bouchaud et al. (2019), we develop a simple asset
pricing model in which agents have sticky expectations about future cash flow growth. Our model,
which nests rational expectations as a special case, allows for a wedge to form between agents’ subjective
expectations and forecasts computed under the true cash flow process. For some sequences
of shocks to the underlying state variables, this gives rise to local return predictability. We show
how this can be captured through familiar state variables such as the dividend-price ratio, the
risk-free rate, and realized return volatility, and also demonstrate why strategies such as forecast
combination can be expected to improve forecast accuracy, as has been documented in studies such
as Rapach, Strauss and Zhou (2010).

References
Adrian, T., E. Etula and T. Muir. 2014. “Financial Intermediaries and the Cross-Section of Asset
Returns.” Journal of Finance 69(6):2557–2596.

Ang, A. and G. Bekaert. 2007. “Stock Return Predictability: Is it There?” Review of Financial
Studies 20(3):651–707.

Ang, Andrew and Dennis Kristensen. 2012. “Testing conditional factor models.” Journal of Financial
Economics 106(1):132–156.

Angeletos, George-Marios and Zhen Huo. 2021. “Myopia and anchoring.” American Economic
Review 111(4):1166–1200.

Angeletos, George-Marios, Zhen Huo and Karthik A Sastry. 2021. “Imperfect macroeconomic
expectations: Evidence and theory.” NBER Macroeconomics Annual 35(1):1–86.

Asness, Clifford S, Tobias J Moskowitz and Lasse Heje Pedersen. 2013. “Value and momentum
everywhere.” The Journal of Finance 68(3):929–985.

Baker, M. and J. Wurgler. 2006. “Investor sentiment and the cross-section of stock returns.” Journal
of Finance 61(4):1645–1680.

Baker, M. and J. Wurgler. 2007. “Investor sentiment in the stock market.” Journal of Economic
Perspectives 21(2):129–152.

Bansal, R. and A. Yaron. 2004. “Risks for the long run: A potential resolution of asset pricing
puzzles.” The Journal of Finance 59(4):1481–1509.

Bansal, R., D. Kiku and A. Yaron. 2012. “An Empirical Evaluation of the Long-Run Risks Model
for Asset Prices.” Critical Finance Review 1(1):183–221.

Barberis, Nicholas, Andrei Shleifer and Robert Vishny. 1998. “A model of investor sentiment.”
Journal of Financial Economics 49(3):307–343.

Bekaert, G. and E. Engstrom. 2017. “Asset Return Dynamics under Habits and Bad Environment
Good Environment Fundamentals.” Journal of Political Economy 125(3):713–760.

Bordalo, Pedro, Nicola Gennaioli, Yueran Ma and Andrei Shleifer. 2020. “Overreaction in macroeconomic
expectations.” American Economic Review 110(9):2748–82.

Bouchaud, Jean-Philippe, Philipp Krueger, Augustin Landier and David Thesmar. 2019. “Sticky
expectations and the profitability anomaly.” The Journal of Finance 74(2):639–674.

Cai, Z. 2007. “Trending time-varying coefficient time series models with serially correlated errors.”
Journal of Econometrics 136(1):163–188.

Campbell, J. Y. 1987. “Stock returns and the term structure.” Journal of Financial Economics
18(2):373–399.

Campbell, J. Y. and J. H. Cochrane. 1999. “By force of habit: A consumption-based explanation
of aggregate stock market behavior.” Journal of Political Economy 107(2):205–251.

Campbell, J. Y. and R. J. Shiller. 1988. “The dividend-price ratio and expectations of future
dividends and discount factors.” Review of Financial Studies 1(3):195–228.

Campbell, J. Y. and S. B. Thompson. 2008. “Predicting excess stock returns out of sample: Can
anything beat the historical average?” Review of Financial Studies 21(4):1509–1531.

Campbell, John Y, Andrew W Lo and A Craig MacKinlay. 1997. 7. Present-Value Relations. In
The Econometrics of Financial Markets. Princeton University Press pp. 253–290.

Chen, B. and Y. Hong. 2012. “Testing for smooth structural changes in time series models via
nonparametric regression.” Econometrica 80(3):1157–1183.

Chen, Yu, Thomas F. Cosimano, Alex A. Himonas and Peter Kelly. 2009. “Asset pricing with
long-run risk and stochastic differential utility: An analytic approach.” Available at SSRN 1502968.

Clark, T. E. and K. D. West. 2007. “Approximately normal tests for equal predictive accuracy in
nested models.” Journal of Econometrics 138(1):291–311.

Clark, Todd E and Michael W McCracken. 2001. “Tests of equal forecast accuracy and encompassing
for nested models.” Journal of Econometrics 105(1):85–110.

Cochrane, John H. 2008. “The dog that did not bark: A defense of return predictability.” The
Review of Financial Studies 21(4):1533–1575.

Coibion, Olivier and Yuriy Gorodnichenko. 2015. “Information rigidity and the expectations formation
process: A simple framework and new facts.” American Economic Review 105(8):2644–78.

Constantinides, G. M. and A. Ghosh. 2017. “Asset Pricing with Countercyclical Household
Consumption Risk.” Journal of Finance 72(1):415–460.

Constantinides, G. M. and D. Duffie. 1996. “Asset Pricing with Heterogeneous Consumers.” Journal
of Political Economy 104(2):219–240.

Creal, D. D. and J. C. Wu. 2016. Bond Risk Premia in Consumption-based Models. NBER Working
Paper 22183.

Dangl, T. and M. Halling. 2012. “Predictive regressions with time-varying coefficients.” Journal of
Financial Economics 106(1):157–181.

Daniel, Kent, David Hirshleifer and Avanidhar Subrahmanyam. 1998. “Investor psychology and
security market under- and overreactions.” Journal of Finance 53(6):1839–1885.

Daniel, Kent and Tobias J Moskowitz. 2016. “Momentum crashes.” Journal of Financial Economics
122(2):221–247.

d’Arienzo, Daniele. 2020. Increasing Overreaction and Excess Volatility of Long Rates. Technical
report Working Paper.

De La O, Ricardo and Sean Myers. 2021. “Subjective cash flow and discount rate expectations.”
The Journal of Finance 76(3):1339–1387.

Di Tella, Sebastian. 2017. “Uncertainty shocks and balance sheet recessions.” Journal of Political
Economy 125(6):2038–2081.

Diebold, F. X. and R. S. Mariano. 1995. “Comparing Predictive Accuracy.” Journal of Business
and Economic Statistics 13(3):253–263.

Drechsler, I. and A. Yaron. 2011. “What’s Vol Got to Do with It.” Review of Financial Studies
24(1):1–45.

Duffie, Darrell and Larry G Epstein. 1992. “Stochastic differential utility.” Econometrica: Journal
of the Econometric Society pp. 353–394.

Ehsani, Sina and Juhani T Linnainmaa. 2021. “Factor Momentum and the Momentum Factor.”
Journal of Finance.

Epstein, L. G. and S. E. Zin. 1989. “Substitution, Risk Aversion, and the Temporal Behavior of
Consumption and Asset Returns: A Theoretical Framework.” Econometrica 57(4):937–969.

Eraker, B. and I. Shaliastovich. 2008. “An Equilibrium Guide to Designing Affine Asset Pricing
Models.” Mathematical Finance 18(4):519–543.

Fama, E. F. and K. R. French. 1988. “Dividend yields and expected stock returns.” Journal of
Financial Economics 22(1):3–25.

Fama, E. F. and K. R. French. 1989. “Business conditions and expected returns on stocks and
bonds.” Journal of Financial Economics 25(1):23–49.

Gârleanu, Nicolae and Stavros Panageas. 2015. “Young, old, conservative, and bold: The implications
of heterogeneity and finite lives for asset pricing.” Journal of Political Economy 123(3):670–685.

Giglio, Stefano and Bryan Kelly. 2018. “Excess volatility: Beyond discount rates.” The Quarterly
Journal of Economics 133(1):71–127.

Gomez Cram, Roberto. 2021. “Late to Recessions: Stocks and the Business Cycle.” Working Paper.

Green, J., R. M. Hand and M. T. Soliman. 2011. “Going, Going, Gone? The Apparent Demise of
the Accruals Anomaly.” Management Science 57(5):797–816.

Hansen, L. P., J. C. Heaton and N. Li. 2008. “Consumption Strikes Back? Measuring Long-Run
Risk.” Journal of Political Economy 116(2):260–302.

Hansen, Lars Peter, Paymon Khorrami and Fabrice Tourre. 2018. “Comparative Valuation
Dynamics in Models with Financing Restrictions.” Working Paper.

Henkel, S. J., J. S. Martin and F. Nardari. 2011. “Time-varying short-horizon predictability.”
Journal of Financial Economics 99(3):560–580.

Herskovic, B., B. Kelly, H. Lustig and S. van Nieuwerburgh. 2016. “The common factor in
idiosyncratic volatility: Quantitative asset pricing implications.” Journal of Financial Economics
119(2):249–283.

Hong, Harrison and Jeremy C Stein. 1999. “A unified theory of underreaction, momentum trading,
and overreaction in asset markets.” The Journal of Finance 54(6):2143–2184.

Hong, Harrison, Terence Lim and Jeremy C Stein. 2000. “Bad news travels slowly: Size, analyst
coverage, and the profitability of momentum strategies.” The Journal of Finance 55(1):265–295.

Hong, Harrison, Walter Torous and Rossen Valkanov. 2007. “Do industries lead stock markets?”
Journal of Financial Economics 83(2):367–396.

Hou, Kewei. 2007. “Industry information diffusion and the lead-lag effect in stock returns.” The
Review of Financial Studies 20(4):1113–1138.

Johannes, M., A. Korteweg and N. Polson. 2014. “Sequential learning, predictive regressions, and
optimal portfolio returns.” Journal of Finance 69(2):611–644.

Katz, Michael, Hanno Lustig and Lars Nielsen. 2017. “Are stocks real assets? sticky discount rates
in stock markets.” The Review of Financial Studies 30(2):539–587.

Keim, D. B. and R. F. Stambaugh. 1986. “Predicting returns in the stock and bond markets.”
Journal of Financial Economics 17(2):357–390.

Kelly, B. and S. Pruitt. 2013. “Market expectations in the cross-section of present values.” Journal
of Finance 68(5):1721–1756.

Lettau, M. and S. C. Ludvigson. 2010. “Measuring and modeling variation in the risk-return
trade-off.” Handbook of Financial Econometrics 1:617–690.

Lettau, M. and S. van Nieuwerburgh. 2008. “Reconciling the return predictability evidence.” Review
of Financial Studies 21(4):1607–1652.

Lustig, H., S. van Nieuwerburgh and A. Verdelhan. 2013. “The Wealth-Consumption Ratio.”
Review of Asset Pricing Studies 3(1):38–94.

Mankiw, N Gregory and Ricardo Reis. 2002. “Sticky information versus sticky prices: a proposal
to replace the New Keynesian Phillips curve.” The Quarterly Journal of Economics 117(4):1295–1328.

McLean, R. D. and J. Pontiff. 2016. “Does Academic Research Destroy Stock Return
Predictability?” Journal of Finance 71(1):5–32.

Moreira, Alan and Tyler Muir. 2017. “Volatility-managed portfolios.” The Journal of Finance
72(4):1611–1644.

Moskowitz, Tobias J and Mark Grinblatt. 1999. “Do industries explain momentum?” The Journal
of Finance 54(4):1249–1290.

Moskowitz, Tobias J, Yao Hua Ooi and Lasse Heje Pedersen. 2012. “Time series momentum.”
Journal of Financial Economics 104(2):228–250.

Paye, B. S. and A. Timmermann. 2006. “Instability of return prediction models.” Journal of
Empirical Finance 13(3):274–315.

Pesaran, M. H. and A. Timmermann. 1995. “Predictability of Stock Returns: Robustness and
Economic Significance.” Journal of Finance 50(4):1201–1228.

Pettenuzzo, D., A. Timmermann and R. Valkanov. 2014. “Forecasting stock returns under economic
constraints.” Journal of Financial Economics 114(3):517–553.

Pettenuzzo, Davide, Riccardo Sabbatucci and Allan Timmermann. 2020. “Cash Flow News and
Stock Price Dynamics.” The Journal of Finance 75(4):2221–2270.

Politis, Dimitris N and Halbert White. 2004. “Automatic block-length selection for the dependent
bootstrap.” Econometric Reviews 23(1):53–70.

Rapach, D. E. and G. Zhou. 2013. “Forecasting stock returns.” Handbook of Economic Forecasting
2:328–383.

Rapach, D. E., J. K. Strauss and G. Zhou. 2010. “Out-of-sample equity premium prediction:
Combination forecasts and links to the real economy.” Review of Financial Studies 23(2):821–862.

Rapach, D. E. and M. E. Wohar. 2006. “Structural breaks and predictive regression models of
aggregate U.S. stock returns.” Journal of Financial Econometrics 4(2):238–274.

Robinson, P. M. 1989. Nonparametric estimation of time-varying parameters. In Statistical Analysis
and Forecasting of Economic Structural Change. Springer pp. 253–264.

Schmidt, L. 2016. Climbing and Falling Off the Ladder: Asset Pricing Implications of Labor Market
Event Risk. SSRN Scholarly Paper ID 2471342.

Schorfheide, Frank, Dongho Song and Amir Yaron. 2018. “Identifying long-run risks: A Bayesian
mixed-frequency approach.” Econometrica 86(2):617–654.

Schwert, G. W. 2003. “Anomalies and market efficiency.” Handbook of the Economics of Finance.

Sims, Christopher A. 2003. “Implications of rational inattention.” Journal of Monetary Economics
50(3):665–690.

Stambaugh, Robert F. 1999. “Predictive regressions.” Journal of Financial Economics 54(3):375–421.

Timmermann, A. 2008. “Elusive Return Predictability.” International Journal of Forecasting
24(1):1–18.

Timmermann, Allan. 2006. “Forecast combinations.” Handbook of Economic Forecasting 1:135–196.

van Binsbergen, J. H. and R. S. J. Koijen. 2010. “Predictive regressions: A present value approach.”
Journal of Finance 65(4):1439–1471.

Wachter, Jessica A. 2005. “Solving models with external habit.” Finance Research Letters 2(4):210–226.

Wachter, Jessica A. 2013. “Can time-varying risk of rare disasters explain aggregate stock market
volatility?” The Journal of Finance 68(3):987–1035.

Wang, Chen. 2020. “Under- and over-reaction in yield curve expectations.” Working Paper.

Welch, I. and A. Goyal. 2008. “A comprehensive look at the empirical performance of equity
premium prediction.” Review of Financial Studies 21(4):1455–1508.

Woodford, Michael. 2003. “Imperfect Common Knowledge and the Effects of Monetary Policy.”
Knowledge, Information, and Expectations in Modern Macroeconomics: In Honor of Edmund S.
Phelps p. 25.

Variables   Slope coefficient   t-statistic   R² (in %)     No. of obs.
Panel A: Full sample
dp          0.025               1.14          0.005         23,786
tbl         -0.007              -2.78         0.053         15,860
tsp         0.017               2.31          0.041         13,846
rvar        6.4 × 10^−5         0.54          4.3 × 10^−4   23,727
Panel B: In-pocket
dp          0.084               2.55          0.18          3,483
tbl         -0.014              -3.29         0.37          3,506
tsp         0.073               3.95          1.47          1,810
rvar        4.8 × 10^−5         0.14          -0.02         4,841
Panel C: Out-of-pocket
dp          0.012               0.44          -0.004        18,943
tbl         -0.003              -0.87         -0.002        10,994
tsp         0.006               0.75          -0.005        10,676
rvar        9.5 × 10^−5         0.66          0.004         17,526

Table 1: Constant-coefficient regression results. This table reports slope coefficient estimates, t-statistics
(computed using Newey-West standard errors), and R² values for univariate regressions of daily excess stock returns
on the lagged predictor variables listed in the rows. The three panels report results for three different sub-periods.
The first panel reports results for the full sample, the second panel reports results for the concatenation of periods
determined to be pockets, and the third panel reports the results for the concatenation of all periods not classified as
pockets. The start dates for each series are: 11/5/1926 for the dividend price ratio (dp), 1/4/1954 for the 3-month
Treasury bill (tbl), 1/2/1962 for the term spread (tsp), and 1/15/1927 for the realized variance (rvar). All series run
through the end of 2016.
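
As an illustration of the kind of calculation behind the t-statistics in Table 1, the following Python sketch estimates a univariate slope and a Newey-West t-statistic. It is only a sketch under stated assumptions: the function name, the Bartlett weights, the absence of a small-sample correction, and the lag length are choices made for this example, and the table does not report the lag length actually used.

```python
import math


def newey_west_slope_tstat(y, x, lags):
    """OLS slope of y on x (with intercept) and its Newey-West t-statistic.

    Assumption for this sketch: Bartlett kernel weights 1 - lag/(lags + 1)
    and no small-sample degrees-of-freedom correction.
    """
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    x_dev = [xi - mean_x for xi in x]
    sxx = sum(v * v for v in x_dev)

    slope = sum(x_dev[i] * (y[i] - mean_y) for i in range(n)) / sxx
    intercept = mean_y - slope * mean_x

    # Score contributions for the slope: demeaned regressor times residual.
    g = [x_dev[i] * (y[i] - intercept - slope * x[i]) for i in range(n)]

    # Long-run variance of the scores with Bartlett-weighted autocovariances.
    long_run = sum(v * v for v in g)
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)
        long_run += 2.0 * weight * sum(g[i] * g[i - lag] for i in range(lag, n))

    std_err = math.sqrt(long_run) / sxx
    return slope, slope / std_err
```

In this sketch, `y` would hold the daily excess returns and `x` the lagged predictor, aligned so that `x[i]` is observed before `y[i]`.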

                     Daily                               Monthly
Statistics           dp      tbl     tsp     rvar        dp      tbl     tsp     rvar
Num pockets          18      12      7       16          15      15      10      18
Fraction of sample   0.15    0.24    0.15    0.22        0.17    0.21    0.12    0.19
Duration
  Min                16      57      95      25          21      21      42      21
  Mean               193.5   292.2   258.6   302.6       243.6   207.9   149.1   226.8
  Max                610     672     501     1,302       714     588     378     735
Integral R²
  Min                -0.24   -0.24   0.28    -0.87       0.04    0.06    0.22    -0.29
  Mean               1.51    3.70    2.92    2.77        1.79    2.21    1.59    1.48
  Max                4.76    11.69   7.54    16.42       6.27    7.43    5.34    5.84

Table 2: Pocket statistics. This table reports statistics on the duration of pockets (in days) and the integral
R² of pockets for pockets estimated with both daily and monthly data. Coefficients are estimated using a 1-sided
kernel with a 2.5-year effective sample size, and pockets are determined as periods where a fitted squared forecast
error differential (relative to a prevailing mean forecast and estimated using a 1-sided kernel with a 1-year effective
sample size) is above 0 in the preceding period.
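
The pocket classification rule described in the caption can be sketched in a few lines of Python. This is a simplified illustration rather than the estimator used in the paper: the Epanechnikov-shaped weights, the bandwidth expressed in observations, the use of a kernel-weighted average as the "fitted" differential, and all function names are assumptions made for the example.

```python
def one_sided_weights(n_past, bandwidth):
    """Backward-looking kernel weights on lags 1..n_past.

    Assumption: Epanechnikov shape, zero weight at or beyond `bandwidth` lags.
    """
    weights = []
    for lag in range(1, n_past + 1):
        u = lag / bandwidth
        weights.append(max(0.0, 0.75 * (1.0 - u * u)))
    return weights


def fitted_sed(sed, t, bandwidth):
    """Kernel-weighted average of the differentials observed strictly before t."""
    weights = one_sided_weights(t, bandwidth)
    total = sum(weights)
    if total == 0.0:
        return 0.0  # no usable history yet
    weighted = sum(w * sed[t - lag] for lag, w in enumerate(weights, start=1))
    return weighted / total


def pocket_flags(sed, bandwidth):
    """Flag period t as in-pocket when the fitted differential, built only
    from information up to t - 1, is above zero."""
    return [fitted_sed(sed, t, bandwidth) > 0.0 for t in range(len(sed))]
```

Here `sed[t]` stands for the squared forecast error of the prevailing mean minus the squared forecast error of the time-varying coefficient model in period t, so positive values favor the predictor model. Because only past differentials enter the flag for period t, the classification is available in real time.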

Panel A: Clark-West statistics
Unrestricted + excess return forecasts All sign restrictions
Variables Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket
dp −0.74 3.00∗∗∗ −1.62† 0.40 3.79∗∗∗ −1.94†† 0.68 4.03∗∗∗ −1.84††
tbl 0.68 3.28∗∗∗ −1.58† 1.98∗∗ 4.75∗∗∗ −1.33† 2.03∗∗ 4.69∗∗∗ −1.21
tsp 0.15 3.04∗∗∗ −1.52† 0.95 4.52∗∗∗ −1.54† 0.79 4.18∗∗∗ −0.21
rvar −1.49† 2.88∗∗∗ −1.77†† −0.79 3.93∗∗∗ −1.07 −0.44 3.10∗∗∗ −0.97
mv −0.99 3.74∗∗∗ −1.49† −0.01 4.01∗∗∗ −1.22 −0.01 4.01∗∗∗ −1.22
pc 0.99 2.71∗∗∗ −0.56 1.85∗∗ 4.69∗∗∗ −0.52 1.85∗∗ 4.69∗∗∗ −0.52
comb1 4.48∗∗∗ 4.57∗∗∗ – 6.35∗∗∗ 6.50∗∗∗ – 6.20∗∗∗ 6.35∗∗∗ –
comb2 4.94∗∗∗ 5.04∗∗∗ – 6.15∗∗∗ 6.28∗∗∗ – 6.28∗∗∗ 6.44∗∗∗ –
comb3 −1.03 2.32∗∗ −2.12†† 0.23 0.74 −1.34† 0.66 2.59∗∗∗ −1.19
Panel B: Economic significance
Unrestricted + excess return forecasts All sign restrictions
Variables α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio
dp 1.69∗∗ 2.10 0.47 2.51∗∗∗ 2.89 0.54 2.95∗∗∗ 3.26 0.57
tbl 3.57∗∗∗ 4.35 0.79 6.48∗∗∗ 5.56 0.94 6.07∗∗∗ 5.37 0.92
tsp 3.14∗∗∗ 4.26 0.77 5.70∗∗∗ 4.95 0.85 5.04∗∗∗ 4.45 0.84
rvar 2.31∗∗∗ 3.63 0.68 2.89∗∗∗ 3.47 0.71 2.69∗∗∗ 3.03 0.54
mv 2.59∗∗∗ 3.37 0.64 4.79∗∗∗ 4.97 0.85 4.79∗∗∗ 4.97 0.85
pc 3.43∗∗∗ 3.97 0.73 5.87∗∗∗ 5.01 0.86 5.87∗∗∗ 5.01 0.86
comb1 6.38∗∗∗ 6.11 1.00 6.72∗∗∗ 6.71 1.05 6.69∗∗∗ 6.46 0.98
comb2 6.10∗∗∗ 5.66 0.87 8.53∗∗∗ 6.69 0.99 8.36∗∗∗ 6.51 0.95
comb3 0.76∗ 1.32 0.43 2.34∗∗ 1.87 0.46 2.72∗∗ 2.07 0.48
pm −0.25† −1.58 0.46 −0.25† −1.58 0.46 −0.25† −1.58 0.46

Table 3: Out-of-sample measures of forecasting performance (daily benchmark specification). Panel A reports the Clark and West (2007)
test statistics for out-of-sample return predictability measured relative to a prevailing mean forecast. Panel B reports 3 measures of economic significance
associated with returns on a portfolio which utilizes the time-varying coefficient model forecast in-pocket and the prevailing mean forecast out-of-pocket to
allocate between the risk-free asset and the market (portfolio weights are limited to be between 0 and 2): the annualized estimated alpha in percentage
points, the HAC t-statistic for the estimated alpha, and the annualized Sharpe Ratio of the portfolio. We use a purely backward-looking kernel with an
effective sample size of 2.5 years to compute forecasts. “pc” is a recursively computed first principal component of the four predictor variables. “mv” is
a four-variable multivariate forecast estimated using a product Kernel. “comb1,” “comb2,” and “comb3” refer to using a simple average of the univariate
forecasts. “comb1” sets an individual predictor’s forecast to the time-varying coefficient model forecast during a pocket and to the prevailing mean otherwise.
“comb2” is the same as “comb1” except that it ignores individual predictor forecasts when that variable is not in a pocket but at least one other variable
is in a pocket. “comb3” makes no distinction between in-pocket and out-of-pocket periods and always uses the simple equal-weighted average of all four
univariate models. The CW test statistics approximately follow a normal distribution with positive values indicating more accurate out-of-sample return
forecasts than the prevailing mean benchmark and negative values indicating the opposite. A pocket is classified as a period where a fitted squared forecast
error differential (estimated using a 1-sided Kernel with a 1-year effective sample size) is above 0 in the preceding period. Consider a particular statistic
of interest, β. ∗’s represent statistical significance at either the 10, 5, or 1% levels from a hypothesis test of β > 0. †’s represent statistical significance at
either the 10, 5, or 1% levels from a hypothesis test of β < 0.
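
For readers who want to see how a Clark and West (2007) statistic of the type reported in Panel A is formed, a minimal Python sketch follows. It assumes a plain (non-HAC) standard error for the mean of the adjusted loss differential, which may differ from the standard errors used for the table; the function and argument names are hypothetical.

```python
import math
import statistics


def clark_west_stat(returns, bench_forecasts, model_forecasts):
    """Clark-West statistic for a nested benchmark versus a larger model.

    Adjusted differential per period:
        (r - bench)^2 - [(r - model)^2 - (bench - model)^2]
    Assumption: the t-statistic uses the ordinary sample standard deviation,
    with no correction for serial correlation.
    """
    adjusted = []
    for r, bench, model in zip(returns, bench_forecasts, model_forecasts):
        adjusted.append((r - bench) ** 2 - ((r - model) ** 2 - (bench - model) ** 2))
    n = len(adjusted)
    return statistics.mean(adjusted) / (statistics.stdev(adjusted) / math.sqrt(n))
```

Positive values indicate that the predictor model improves on the prevailing mean once the noise from estimating the extra coefficient is removed. The in-pocket and out-of-pocket versions would apply the same function to the corresponding subsets of periods.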
Unrestricted + excess return forecasts All sign restrictions
i.i.d. Block EGARCH i.i.d. Block EGARCH i.i.d. Block EGARCH
Statistics Actual Avg. p-val Avg. p-val Avg. p-val Actual Avg. p-val Avg. p-val Avg. p-val Actual Avg. p-val Avg. p-val Avg. p-val
dp
CW_FS −0.74 −0.14 0.71 −0.22 0.69 −0.25 0.68 0.40 −0.04 0.31 −0.04 0.33 0.03 0.35 0.68 −0.03 0.23 −0.12 0.21 0.11 0.28
CW_IP 3.00 −0.16 0.00 0.06 0.00 −0.28 0.00 3.79 −0.10 0.00 0.30 0.00 0.04 0.00 4.03 −0.05 0.00 0.20 0.00 0.01 0.00
CW_OOP −1.62 −0.07 0.94 −0.36 0.90 −0.13 0.92 −1.94 −0.00 0.98 −0.36 0.95 −0.01 0.98 −1.84 −0.01 0.96 −0.32 0.94 0.12 0.98
α̂ 1.69 −0.23 0.05 0.35 0.11 0.14 0.04 2.51 −0.22 0.01 0.45 0.04 0.48 0.05 2.95 −0.27 0.00 0.24 0.01 0.40 0.03
tα̂ 2.10 −0.22 0.01 0.31 0.04 0.24 0.03 2.89 −0.19 0.00 0.35 0.00 0.37 0.01 3.26 −0.22 0.00 0.17 0.00 0.30 0.00
SR 0.47 0.44 0.37 0.44 0.39 0.49 0.57 0.54 0.43 0.15 0.44 0.17 0.49 0.35 0.57 0.43 0.07 0.44 0.12 0.49 0.25
tbl
CW_FS 0.68 0.71 0.51 0.67 0.51 0.35 0.37 1.98 0.68 0.13 0.65 0.12 1.05 0.22 2.03 1.21 0.24 1.11 0.23 1.36 0.28
CW_IP 3.28 0.53 0.01 0.53 0.01 0.27 0.01 4.75 0.55 0.00 0.50 0.00 0.95 0.02 4.69 0.87 0.00 0.80 0.00 1.03 0.01
CW_OOP −1.58 0.45 0.98 0.39 0.97 0.18 0.95 −1.33 0.39 0.95 0.40 0.94 0.54 0.96 −1.21 0.83 0.97 0.78 0.96 0.91 0.98
α̂ 3.57 0.36 0.01 0.38 0.02 1.65 0.09 6.48 0.48 0.00 0.39 0.00 2.64 0.04 6.07 0.72 0.00 0.61 0.00 2.66 0.06
tα̂ 4.35 0.30 0.00 0.30 0.00 1.92 0.04 5.56 0.36 0.00 0.29 0.00 2.06 0.01 5.37 0.57 0.00 0.48 0.00 2.02 0.01
SR 0.79 0.55 0.18 0.53 0.14 0.77 0.45 0.94 0.53 0.07 0.53 0.06 0.77 0.26 0.92 0.53 0.07 0.53 0.06 0.76 0.28
tsp
CW_FS 0.15 0.61 0.67 0.90 0.75 0.25 0.54 0.95 0.71 0.42 0.97 0.51 0.94 0.49 0.79 0.75 0.47 0.93 0.56 0.94 0.52
CW_IP 3.04 0.46 0.01 0.81 0.02 0.08 0.00 4.52 0.55 0.00 0.88 0.00 0.76 0.00 4.18 0.51 0.00 0.85 0.00 0.84 0.00
CW_OOP −1.52 0.38 0.97 0.44 0.97 0.25 0.96 −1.54 0.44 0.97 0.46 0.97 0.57 0.97 −0.21 0.54 0.75 0.54 0.76 0.61 0.77
α̂ 3.14 0.34 0.02 0.82 0.04 0.84 0.01 5.70 0.46 0.00 0.94 0.00 1.32 0.00 5.04 0.57 0.00 1.07 0.00 1.44 0.01
tα̂ 4.26 0.27 0.00 0.68 0.00 1.11 0.00 4.95 0.33 0.00 0.70 0.00 1.11 0.00 4.45 0.44 0.00 0.82 0.00 1.23 0.00
SR 0.77 0.42 0.01 0.44 0.03 0.42 0.02 0.85 0.42 0.01 0.43 0.01 0.41 0.01 0.84 0.42 0.01 0.44 0.01 0.42 0.01
rvar
CW_FS −1.49 −0.03 0.92 −0.21 0.91 −0.37 0.86 −0.79 −0.03 0.77 0.13 0.81 −0.05 0.75 −0.44 0.20 0.73 0.27 0.75 0.19 0.72
CW_IP 2.88 −0.06 0.00 −0.01 0.00 −0.35 0.00 3.93 −0.06 0.00 0.28 0.00 −0.13 0.00 3.10 0.07 0.00 0.31 0.00 −0.09 0.00
CW_OOP −1.77 −0.01 0.96 −0.28 0.94 −0.22 0.95 −1.07 −0.02 0.84 −0.12 0.82 0.02 0.85 −0.97 0.17 0.88 0.08 0.85 0.27 0.90
α̂ 2.31 −0.14 0.02 0.61 0.06 0.34 0.01 2.89 −0.18 0.01 0.95 0.06 0.76 0.05 2.69 0.02 0.02 0.92 0.09 0.55 0.05
tα̂ 3.63 −0.15 0.00 0.63 0.00 0.51 0.00 3.47 −0.17 0.00 0.76 0.00 0.68 0.00 3.03 −0.01 0.00 0.74 0.01 0.51 0.01
SR 0.68 0.44 0.05 0.46 0.07 0.55 0.21 0.71 0.45 0.03 0.47 0.05 0.55 0.15 0.54 0.46 0.26 0.46 0.31 0.55 0.51

Table 4: OOS statistical model simulations (daily). This table reports Monte Carlo simulation results for the 1-sided Kernel empirical
findings. We consider 3 ways of bootstrapping the fitted residuals from a constant coefficient predictive regression model for excess returns and an AR(1)
model for the predictor: (i) an i.i.d. heteroskedastic bootstrap, (ii) a stationary block bootstrap where the optimal block length is chosen according to Politis
and White (2004), (iii) an EGARCH(1,1) with t-distributed shocks. All residuals are resampled jointly to preserve the cross-sectional correlation between
the innovations to the predictor and excess returns. We generate 1,000 bootstrap samples of the same sample size as is available for each predictor in the
data. A pocket is classified as a period where a fitted squared forecast error differential (estimated using a 1-sided Kernel with a 1-year effective sample
size) is above 0 in the preceding period. We report 6 statistics. The first 3 are Clark and West (2007) t-statistics relative to a prevailing mean benchmark
in the full sample, in-pocket, and out-of-pocket. The second 3 are economic statistics associated with returns on a portfolio which utilizes the time-varying
coefficient model forecast in-pocket and the prevailing mean forecast out-of-pocket to allocate between the risk-free asset and the market (portfolio weights
are limited to be between 0 and 2): the annualized estimated alpha in percentage points, the HAC t-statistic associated with that estimated alpha, and the
annualized Sharpe Ratio of the portfolio. Column 2 presents the corresponding statistics from the data for reference.
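
The first of the three resampling schemes can be illustrated with a short Python sketch that generates one artificial sample under the null of a constant-coefficient predictive regression with an AR(1) predictor. It is a simplified illustration: it draws residual pairs with replacement and does not implement the block or EGARCH variants, and the function name, argument names, and initialization at `x0` are assumptions made for the example.

```python
import random


def simulate_bootstrap_sample(alpha, beta, mu, rho, x0, resid_pairs, n, rng):
    """Simulate n periods of (return, predictor) under constant coefficients.

    r[t+1] = alpha + beta * x[t] + eps_r
    x[t+1] = mu + rho * x[t] + eps_x
    Residual pairs (eps_r, eps_x) are drawn jointly so that the correlation
    between return and predictor innovations is preserved.
    """
    x_prev = x0
    returns, predictors = [], []
    for _ in range(n):
        eps_r, eps_x = rng.choice(resid_pairs)  # joint draw with replacement
        returns.append(alpha + beta * x_prev + eps_r)
        x_prev = mu + rho * x_prev + eps_x
        predictors.append(x_prev)
    return returns, predictors
```

A call such as `simulate_bootstrap_sample(a_hat, b_hat, mu_hat, rho_hat, x[0], pairs, len(x), random.Random(1))`, with hypothetical fitted values as inputs, would produce one artificial sample. Repeating the draw 1,000 times and recomputing the six statistics on each sample would give the simulated averages and p-values of the kind reported in the table.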
Unrestricted
Variables Full sample In-pocket (real time) Out-of-pocket (real time)
2.5yCoef, 1ySED 2yCoef, 1ySED 3yCoef, 1ySED 2.5yCoef, 6mSED 2.5yCoef, 1.5ySED 2.5yCoef, 1ySED 2yCoef, 1ySED 3yCoef, 1ySED 2.5yCoef, 6mSED 2.5yCoef, 1.5ySED
dp −0.74 3.00∗∗∗ 2.92∗∗∗ 3.43∗∗∗ 3.59∗∗∗ 2.57∗∗∗ −1.62† −2.12†† −1.62† −2.28†† −1.47†
tbl 0.68 3.28∗∗∗ 3.54∗∗∗ 3.32∗∗∗ 3.88∗∗∗ 3.88∗∗∗ −1.58† −1.00 −1.73†† −1.61† −1.98††
tsp 0.15 3.04∗∗∗ 2.83∗∗∗ 3.29∗∗∗ 4.17∗∗∗ 2.77∗∗∗ −1.52† −1.08 −1.63† −1.84†† −1.38†
rvar −1.49† 2.88∗∗∗ 2.42∗∗∗ 2.88∗∗∗ 4.60∗∗∗ 1.93∗∗ −1.77†† −1.82†† −1.83†† −2.05†† −1.77††
mv −0.99 3.74∗∗∗ 2.65∗∗∗ 4.08∗∗∗ 4.58∗∗∗ 1.68∗∗ −1.49† −1.41† −1.29† −1.80†† −1.28
pc 0.99 2.71∗∗∗ 3.36∗∗∗ 2.96∗∗∗ 5.05∗∗∗ 3.55∗∗∗ −0.56 −0.80 −0.56 −2.08†† −1.28
comb1 4.48∗∗∗ 4.58∗∗∗ 4.34∗∗∗ 4.74∗∗∗ 5.27∗∗∗ 4.64∗∗∗ – – – – –
comb2 4.94∗∗∗ 5.05∗∗∗ 4.88∗∗∗ 5.25∗∗∗ 5.72∗∗∗ 5.13∗∗∗ – – – – –
comb3 −1.03 2.11∗∗ 2.68∗∗∗ 2.05∗∗ 2.44∗∗∗ 2.31∗∗ −2.02†† −2.34††† −1.95†† −2.20†† −2.11††
+ excess returns
Variables Full sample In-pocket (real time) Out-of-pocket (real time)
2.5yCoef, 1ySED 2yCoef, 1ySED 3yCoef, 1ySED 2.5yCoef, 6mSED 2.5yCoef, 1.5ySED 2.5yCoef, 1ySED 2yCoef, 1ySED 3yCoef, 1ySED 2.5yCoef, 6mSED 2.5yCoef, 1.5ySED
dp 0.40 3.79∗∗∗ 3.95∗∗∗ 4.05∗∗∗ 4.30∗∗∗ 3.11∗∗∗ −1.94†† −2.09†† −1.90†† −2.47††† −1.65††
tbl 1.98∗∗ 4.75∗∗∗ 3.98∗∗∗ 4.06∗∗∗ 5.15∗∗∗ 4.46∗∗∗ −1.33† 0.15 −1.23 −1.76†† −0.76
tsp 0.95 4.52∗∗∗ 4.29∗∗∗ 4.24∗∗∗ 6.44∗∗∗ 3.05∗∗∗ −1.54† −0.63 −1.28 −2.39††† −0.48
rvar −0.79 3.93∗∗∗ 4.50∗∗∗ 3.09∗∗∗ 5.17∗∗∗ 2.11∗∗ −1.07 −1.27 −1.31† −1.44† −1.24
mv −0.01 4.01∗∗∗ 4.13∗∗∗ 4.06∗∗∗ 4.59∗∗∗ 2.98∗∗∗ −1.22 −1.17 −1.04 −1.52† −0.92
pc 1.85∗∗ 4.69∗∗∗ 3.93∗∗∗ 4.34∗∗∗ 6.62∗∗∗ 4.53∗∗∗ −0.52 0.46 −0.31 −1.78†† −0.43
comb1 6.35∗∗∗ 6.50∗∗∗ 6.06∗∗∗ 5.50∗∗∗ 7.76∗∗∗ 5.17∗∗∗ – – – – –
comb2 6.15∗∗∗ 6.28∗∗∗ 5.86∗∗∗ 6.16∗∗∗ 7.41∗∗∗ 5.29∗∗∗ – – – – –
comb3 0.23 0.74 0.64 2.71∗∗∗ 2.90∗∗∗ 3.33∗∗∗ −1.34† −0.89 −1.43† −1.69†† −1.55†
All sign restrictions
Variables Full sample In-pocket (real time) Out-of-pocket (real time)
2.5yCoef, 1ySED 2yCoef, 1ySED 3yCoef, 1ySED 2.5yCoef, 6mSED 2.5yCoef, 1.5ySED 2.5yCoef, 1ySED 2yCoef, 1ySED 3yCoef, 1ySED 2.5yCoef, 6mSED 2.5yCoef, 1.5ySED
dp 0.68 4.03∗∗∗ 3.82∗∗∗ 4.28∗∗∗ 4.45∗∗∗ 3.15∗∗∗ −1.84†† −1.94†† −1.75†† −2.46††† −1.55†
tbl 2.03∗∗ 4.69∗∗∗ 4.22∗∗∗ 4.12∗∗∗ 4.98∗∗∗ 4.50∗∗∗ −1.21 0.52 −1.34† −1.45† −0.56
tsp 0.79 4.18∗∗∗ 3.68∗∗∗ 3.99∗∗∗ 5.64∗∗∗ 3.11∗∗∗ −0.21 1.24 −0.24 −0.19 0.38
rvar −0.44 3.10∗∗∗ 2.05∗∗ 3.24∗∗∗ 4.79∗∗∗ 2.38∗∗∗ −0.97 −1.05 −1.20 −1.41† −1.11
mv −0.01 4.01∗∗∗ 4.13∗∗∗ 4.06∗∗∗ 4.59∗∗∗ 2.98∗∗∗ −1.22 −1.17 −1.04 −1.52† −0.92
pc 1.85∗∗ 4.69∗∗∗ 3.93∗∗∗ 4.34∗∗∗ 6.62∗∗∗ 4.53∗∗∗ −0.52 0.46 −0.31 −1.78†† −0.43
comb1 6.20∗∗∗ 6.35∗∗∗ 5.56∗∗∗ 5.64∗∗∗ 7.73∗∗∗ 5.29∗∗∗ – – – – –
comb2 6.28∗∗∗ 6.44∗∗∗ 5.93∗∗∗ 6.44∗∗∗ 7.71∗∗∗ 5.43∗∗∗ – – – – –
comb3 0.66 2.59∗∗∗ 1.25 2.39∗∗∗ 2.65∗∗∗ 2.78∗∗∗ −1.19 0.25 −1.02 −1.23 −1.01

Table 5: Robustness of out-of-sample measures of forecasting performance. This table reports the Clark and West (2007) test
statistics for out-of-sample return predictability measured relative to a prevailing mean forecast for different combinations of bandwidths for
both the coefficient from the predictive regression and fitted squared forecast error differential estimation. In each column header, the first
duration corresponds to the effective sample size for estimating the coefficient and the second duration corresponds to the effective sample
size for estimating the fitted squared forecast error differential. “pc” is a recursively computed first principal component of the four predictor
variables. “mv” is a four-variable multivariate forecast estimated using a product Kernel. “comb1,” “comb2,” and “comb3” refer to using a
simple average of the univariate forecasts. “comb1” sets an individual predictor’s forecast to the time-varying coefficient model forecast during
a pocket and to the prevailing mean otherwise. “comb2” is the same as “comb1” except that it ignores individual predictor forecasts when
that variable is not in a pocket but at least one other variable is in a pocket. “comb3” makes no distinction between pocket and non-pocket
periods and always uses the simple equal-weighted average of all four univariate models. The CW test statistics approximately follow a normal
distribution with positive values indicating more accurate out-of-sample return forecasts than the prevailing mean benchmark and negative
values indicating the opposite. A pocket is classified as a period where a fitted squared forecast error differential is above 0 in the preceding
period. Consider a particular statistic of interest, β. ∗’s represent statistical significance at either the 10, 5, or 1% levels from a hypothesis test
of β > 0. †’s represent statistical significance at either the 10, 5, or 1% levels from a hypothesis test of β < 0.
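
The three combination schemes defined in the caption can be made concrete with a small Python sketch for a single period. It reflects one reading of the caption: in particular, "comb2" is interpreted as the average over in-pocket predictors only, reverting to the prevailing mean when no predictor is in a pocket. The function and argument names are hypothetical.

```python
def comb_forecasts(model_forecasts, in_pocket, prevailing_mean):
    """Return (comb1, comb2, comb3) for one period.

    model_forecasts: time-varying coefficient forecasts, one per predictor.
    in_pocket: booleans, True when that predictor is currently in a pocket.
    prevailing_mean: the benchmark no-predictability forecast.
    """
    n = len(model_forecasts)
    pairs = list(zip(model_forecasts, in_pocket))

    # comb1: out-of-pocket predictors contribute the prevailing mean.
    comb1 = sum(f if flag else prevailing_mean for f, flag in pairs) / n

    # comb2: out-of-pocket predictors are dropped whenever another
    # predictor is in a pocket (assumed interpretation).
    active = [f for f, flag in pairs if flag]
    comb2 = sum(active) / len(active) if active else prevailing_mean

    # comb3: equal-weighted average regardless of pocket status.
    comb3 = sum(model_forecasts) / n
    return comb1, comb2, comb3
```

With two of four predictors in a pocket, "comb1" shrinks the combined forecast toward the prevailing mean, while "comb2" averages only the two active forecasts.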
Unrestricted + excess return forecasts All sign restrictions
Variables Mkt Mkt-Vol Mkt-Mom Mkt-Vol-Mom Mkt Mkt-Vol Mkt-Mom Mkt-Vol-Mom Mkt Mkt-Vol Mkt-Mom Mkt-Vol-Mom
dp 1.69∗∗ 1.70∗∗ 1.51∗∗ 1.48∗∗ 2.51∗∗∗ 2.50∗∗∗ 2.14∗∗∗ 2.09∗∗∗ 2.95∗∗∗ 2.94∗∗∗ 2.53∗∗∗ 2.48∗∗∗
(2.10) (2.12) (1.86) (1.81) (2.89) (2.90) (2.52) (2.47) (3.26) (3.27) (2.87) (2.82)
tbl 3.57∗∗∗ 3.51∗∗∗ 3.23∗∗∗ 3.23∗∗∗ 6.48∗∗∗ 6.21∗∗∗ 5.53∗∗∗ 5.59∗∗∗ 6.07∗∗∗ 5.86∗∗∗ 5.28∗∗∗ 5.32∗∗∗
(4.35) (4.28) (3.91) (3.90) (5.56) (5.65) (5.02) (5.15) (5.37) (5.36) (4.82) (4.87)
tsp 3.15∗∗∗ 3.09∗∗∗ 2.85∗∗∗ 2.84∗∗∗ 5.70∗∗∗ 5.44∗∗∗ 4.84∗∗∗ 4.91∗∗∗ 5.04∗∗∗ 4.87∗∗∗ 4.27∗∗∗ 4.29∗∗∗
(4.26) (4.21) (3.85) (3.84) (4.95) (5.08) (4.40) (4.53) (4.45) (4.49) (4.02) (4.04)
rvar 2.31∗∗∗ 2.30∗∗∗ 1.68∗∗∗ 1.64∗∗∗ 2.89∗∗∗ 2.87∗∗∗ 2.21∗∗∗ 2.16∗∗∗ 2.69∗∗∗ 2.67∗∗∗ 2.08∗∗∗ 2.04∗∗∗
(3.63) (3.60) (2.79) (2.78) (3.47) (3.44) (2.85) (2.84) (3.03) (3.05) (2.50) (2.45)
mv 2.59∗∗∗ 2.59∗∗∗ 2.51∗∗∗ 2.49∗∗∗ 4.79∗∗∗ 4.80∗∗∗ 4.71∗∗∗ 4.68∗∗∗ 4.79∗∗∗ 4.80∗∗∗ 4.71∗∗∗ 4.68∗∗∗
(3.37) (3.35) (3.21) (3.20) (4.97) (4.96) (4.80) (4.78) (4.97) (4.96) (4.80) (4.78)
pc 3.43∗∗∗ 3.34∗∗∗ 2.98∗∗∗ 2.99∗∗∗ 5.87∗∗∗ 5.59∗∗∗ 4.82∗∗∗ 4.89∗∗∗ 5.87∗∗∗ 5.59∗∗∗ 4.82∗∗∗ 4.89∗∗∗
(3.97) (3.90) (3.54) (3.56) (5.01) (5.14) (4.53) (4.67) (5.01) (5.14) (4.53) (4.67)
comb1 6.38∗∗∗ 6.33∗∗∗ 5.71∗∗∗ 5.66∗∗∗ 6.72∗∗∗ 6.55∗∗∗ 5.83∗∗∗ 5.84∗∗∗ 6.69∗∗∗ 6.59∗∗∗ 5.90∗∗∗ 5.88∗∗∗
(6.11) (6.06) (5.68) (5.65) (6.71) (6.76) (6.20) (6.21) (6.46) (6.44) (5.96) (5.91)
comb2 6.10∗∗∗ 6.08∗∗∗ 5.46∗∗∗ 5.41∗∗∗ 8.53∗∗∗ 8.33∗∗∗ 7.47∗∗∗ 7.50∗∗∗ 8.36∗∗∗ 8.27∗∗∗ 7.44∗∗∗ 7.40∗∗∗
(5.66) (5.64) (5.09) (5.03) (6.69) (6.78) (5.92) (5.95) (6.51) (6.50) (5.87) (5.80)
comb3 0.76∗ 0.70 0.27 0.25 2.34∗∗ 2.23∗∗ 1.27 1.24 2.72∗∗ 2.69∗∗ 1.82∗ 1.75∗
(1.32) (1.22) (0.52) (0.50) (1.87) (1.80) (1.14) (1.11) (2.07) (2.07) (1.52) (1.47)
pm −0.25† −0.29†† −0.39††† −0.38††† −0.25† −0.29†† −0.39††† −0.38††† −0.25† −0.29†† −0.39††† −0.38†††
(−1.58) (−2.00) (−2.68) (−2.71) (−1.58) (−2.00) (−2.68) (−2.71) (−1.58) (−2.00) (−2.68) (−2.71)

Table 6: Robustness of out-of-sample economic forecasting performance to the inclusion of additional factors. This table reports annualized
estimated alphas in percentage points associated with returns on a daily portfolio which utilizes the time-varying coefficient model forecast in-pocket and
the prevailing mean forecast out-of-pocket to allocate between the risk-free asset and the market (portfolio weights are limited to be between 0 and 2).
We consider 4 specifications for estimating alpha: 1) the CAPM model which has excess returns on the market portfolio as the only factor; 2) a 2-factor
model which uses the market factor and a volatility factor constructed as in Moreira and Muir (2017); 3) a 2-factor model which uses the market factor
and a momentum factor on the market constructed using equation (5) on p. 236 of Moskowitz et al. (2012); 4) a 3-factor model which includes the market,
volatility, and momentum factors. Significance of the estimated alpha is assessed using a t-statistic estimated using HAC standard errors, which is reported
in parentheses below each alpha estimate. We use a purely backward-looking kernel to compute forecasts. “pc” is a recursively computed first principal
component of the four predictor variables. “mv” is a four-variable multivariate forecast estimated using a product Kernel. “comb1,” “comb2,” and “comb3”
refer to using a simple average of the univariate forecasts. “comb1” sets an individual predictor’s forecast to the time-varying coefficient model forecast
during a pocket and to the prevailing mean otherwise. “comb2” is the same as “comb1” except that it ignores individual predictor forecasts when that
variable is not in a pocket but at least one other variable is in a pocket. “comb3” makes no distinction between pocket and non-pocket periods and always
uses the simple equal-weighted average of all four univariate models. A pocket is classified as a period where a fitted squared forecast error differential
(estimated using a 1-sided Kernel with a 1-year effective sample size) is above 0 in the preceding period. Consider a particular statistic of interest, β. ∗’s
represent statistical significance at either the 10, 5, or 1% levels from a hypothesis test of β > 0. †’s represent statistical significance at either the 10, 5, or
1% levels from a hypothesis test of β < 0.
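
The three combination rules described in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `combine_forecasts` and the array layout (one column per predictor) are assumptions made for the example.

```python
import numpy as np

def combine_forecasts(tvc, pm, in_pocket):
    """Hypothetical helper illustrating comb1, comb2 and comb3.

    tvc       : (T, N) time-varying coefficient forecasts, one column per predictor
    pm        : (T,)   prevailing mean forecasts
    in_pocket : (T, N) boolean pocket indicators, one column per predictor
    """
    tvc = np.asarray(tvc, dtype=float)
    pm = np.asarray(pm, dtype=float)
    in_pocket = np.asarray(in_pocket, dtype=bool)

    # comb1: each predictor contributes its own forecast in-pocket and the
    # prevailing mean otherwise; the N contributions are then averaged.
    comb1 = np.where(in_pocket, tvc, pm[:, None]).mean(axis=1)

    # comb2: if at least one predictor is in a pocket, average only the
    # in-pocket forecasts; if none is, fall back to the prevailing mean.
    n_in = in_pocket.sum(axis=1)
    sum_in = np.where(in_pocket, tvc, 0.0).sum(axis=1)
    comb2 = np.where(n_in > 0, sum_in / np.maximum(n_in, 1), pm)

    # comb3: equal-weighted average of all univariate forecasts, ignoring pockets.
    comb3 = tvc.mean(axis=1)
    return comb1, comb2, comb3

tvc = [[0.02, 0.04], [0.01, 0.03], [0.05, -0.01]]
pm = [0.01, 0.01, 0.01]
in_pocket = [[True, False], [False, False], [True, True]]
comb1, comb2, comb3 = combine_forecasts(tvc, pm, in_pocket)
```

The only difference between comb1 and comb2 appears in periods where some, but not all, predictors are in a pocket: comb1 dilutes the in-pocket forecasts with the prevailing mean, while comb2 does not.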
Panel A: Clark-West statistics
Unrestricted + excess return forecasts All sign restrictions
Variables Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket
dp 0.96 4.05∗∗∗ −0.09 1.03 4.14∗∗∗ −3.12††† 1.13 4.14∗∗∗ −3.01†††
tbl 1.25 3.55∗∗∗ −0.83 1.23 4.38∗∗∗ −1.99†† 2.40∗∗∗ 4.47∗∗∗ −1.26
tsp 0.78 2.44∗∗∗ −1.15 0.28 4.75∗∗∗ −1.45† 0.46 4.93∗∗∗ −0.20
rvar 0.64 3.28∗∗∗ 0.00 0.40 3.18∗∗∗ −2.73††† 1.04 3.58∗∗∗ −2.10††
mv 1.76∗∗ 3.18∗∗∗ 1.28 1.65∗∗ 3.70∗∗∗ −0.69 1.65∗∗ 3.70∗∗∗ −0.69
pc 1.22 3.23∗∗∗ −1.23 1.04 4.64∗∗∗ −1.25 1.04 4.64∗∗∗ −1.25
comb1 4.14∗∗∗ 4.65∗∗∗ – 4.73∗∗∗ 5.04∗∗∗ – 4.82∗∗∗ 5.13∗∗∗ –
comb2 4.48∗∗∗ 5.11∗∗∗ – 4.81∗∗∗ 5.09∗∗∗ – 5.48∗∗∗ 5.90∗∗∗ –
comb3 1.01 2.09∗∗ −1.97†† 1.10 1.86∗∗ −1.31† 1.79∗∗ 2.12∗∗ −0.93
Panel B: Economic significance
Unrestricted + excess return forecasts All sign restrictions
Variables α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio
dp 2.35∗∗∗ 2.51 0.54 4.09∗∗∗ 3.24 0.69 4.09∗∗∗ 3.24 0.69
tbl 3.75∗∗∗ 3.34 0.76 6.04∗∗∗ 4.43 0.86 6.17∗∗∗ 4.40 0.86
tsp 2.20∗∗∗ 2.64 0.65 5.09∗∗∗ 3.52 0.76 4.06∗∗∗ 3.21 0.70
rvar 2.19∗∗∗ 2.71 0.55 3.51∗∗∗ 3.25 0.65 3.71∗∗∗ 3.67 0.66
mv 1.27∗∗ 1.97 0.47 3.52∗∗∗ 3.29 0.60 3.52∗∗∗ 3.29 0.60
pc 3.74∗∗∗ 3.41 0.77 5.34∗∗∗ 3.82 0.79 5.34∗∗∗ 3.82 0.79
comb1 6.95∗∗∗ 5.18 1.03 6.63∗∗∗ 5.10 1.00 6.57∗∗∗ 5.56 1.08
comb2 6.56∗∗∗ 4.97 0.94 7.98∗∗∗ 6.11 1.00 8.80∗∗∗ 6.16 1.03
comb3 1.41∗ 1.52 0.47 2.23∗ 1.35 0.45 4.48∗∗∗ 3.22 0.62
pm −0.39†† −2.04 0.50 −0.39†† −2.04 0.50 −0.39†† −2.04 0.50

Table 7: Out-of-sample measures of forecasting performance (monthly benchmark specification). Panel A reports the Clark and West (2007)
test statistics for out-of-sample return predictability measured relative to a prevailing mean forecast. Panel B reports 3 measures of economic significance
associated with returns on a portfolio which utilizes the time-varying coefficient model forecast in-pocket and the prevailing mean forecast out-of-pocket to
allocate between the risk-free asset and the market (portfolio weights are limited to be between 0 and 2): the annualized estimated alpha in percentage
points, the HAC t-statistic for the estimated alpha, and the annualized Sharpe Ratio of the portfolio. We use a purely backward-looking kernel with an
effective sample size of 2.5 years to compute forecasts. “pc” is a recursively computed first principal component of the four predictor variables. “mv” is
a four-variable multivariate forecast estimated using a product Kernel. “comb1,” “comb2,” and “comb3” refer to using a simple average of the univariate
forecasts. “comb1” sets an individual predictor’s forecast to the time-varying coefficient model forecast during a pocket and to the prevailing mean otherwise.
“comb2” is the same as “comb1” except that it ignores individual predictor forecasts when that variable is not in a pocket but at least one other variable
is in a pocket. “comb3” makes no distinction between in-pocket and out-of-pocket periods and always uses the simple equal-weighted average of all four
univariate models. The CW test statistics approximately follow a normal distribution with positive values indicating more accurate out-of-sample return
forecasts than the prevailing mean benchmark and negative values indicating the opposite. A pocket is classified as a period where a fitted squared forecast
error differential (estimated using a 1-sided Kernel with a 1-year effective sample size) is above 0 in the preceding period. Consider a particular statistic
of interest, β. ∗’s represent statistical significance at either the 10, 5, or 1% levels from a hypothesis test of β > 0. †’s represent statistical significance at
either the 10, 5, or 1% levels from a hypothesis test of β < 0.
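
For readers who want to see how a Clark and West (2007) statistic of the kind reported in Panel A is formed, the following is a minimal sketch. It uses a plain (non-HAC) standard error for brevity, which is a simplifying assumption of this example; the function name `clark_west_stat` is hypothetical.

```python
import numpy as np

def clark_west_stat(y, f_bench, f_model):
    """Clark-West adjusted-MSPE statistic (simplified: no HAC correction).

    y       : realized returns
    f_bench : forecasts from the nested benchmark (e.g., prevailing mean)
    f_model : forecasts from the larger model
    """
    y, f_bench, f_model = (np.asarray(a, dtype=float) for a in (y, f_bench, f_model))
    # Benchmark squared error minus the model squared error, where the latter
    # is adjusted for the noise introduced by estimating extra parameters.
    adj = (y - f_bench) ** 2 - ((y - f_model) ** 2 - (f_bench - f_model) ** 2)
    n = adj.size
    # t-statistic for the null that the mean adjusted differential is zero.
    return np.sqrt(n) * adj.mean() / adj.std(ddof=1)

y = np.array([1.0, 2.0, 3.0, 4.0])
stat = clark_west_stat(y, np.zeros(4), y)
```

Positive values favor the larger model over the benchmark, which matches the sign convention stated in the caption.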
Panel A: Clark-West statistics
SMB HML
Variables Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket
dp 2.03∗∗ 4.75∗∗∗ 0.55 1.41∗ 4.49∗∗∗ −0.91
tbl 1.77∗∗ 4.83∗∗∗ −0.15 1.90∗∗ 3.88∗∗∗ −0.21
tsp 2.83∗∗∗ 3.58∗∗∗ 0.42 1.93∗∗ 3.83∗∗∗ −1.49†
rvar 0.89 4.98∗∗∗ 0.63 −0.10 4.22∗∗∗ −1.33†
mv 2.97∗∗∗ 5.94∗∗∗ 1.90∗∗ 2.05∗∗ 5.18∗∗∗ −0.09
pc 3.48∗∗∗ 3.60∗∗∗ 1.42∗ 2.01∗∗ 3.35∗∗∗ −0.74
comb1 5.75∗∗∗ 6.00∗∗∗ – 5.46∗∗∗ 5.61∗∗∗ –
comb2 4.67∗∗∗ 4.86∗∗∗ – 5.33∗∗∗ 5.46∗∗∗ –
comb3 2.02∗∗ 3.56∗∗∗ 0.38 1.49∗ 4.21∗∗∗ −1.65††
Panel B: Economic significance
SMB HML
Variables α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio
dp 2.95∗∗∗ 3.66 0.81 3.29∗∗∗ 5.13 1.18
tbl 3.35∗∗∗ 4.44 0.90 2.89∗∗∗ 4.40 1.07
tsp 2.57∗∗∗ 4.26 0.95 2.33∗∗∗ 3.74 0.80
rvar 3.43∗∗∗ 4.80 1.00 2.97∗∗∗ 4.28 1.11
mv 3.27∗∗∗ 4.23 0.85 3.74∗∗∗ 4.96 0.99
pc 2.43∗∗∗ 4.07 0.86 1.94∗∗∗ 3.19 0.74
comb1 5.15∗∗∗ 6.22 1.34 4.46∗∗∗ 5.80 1.18
comb2 4.07∗∗∗ 5.56 1.19 3.87∗∗∗ 5.50 1.07
comb3 1.18∗∗ 2.12 0.45 1.40∗∗ 1.87 0.62
pm −0.38 −0.75 0.17 −0.17† −1.55 0.62

Table 8: Out-of-sample measures of forecasting performance (Fama-French factor portfolio
excess returns, daily). Panel A reports the Clark and West (2007) test statistics for out-of-sample return
predictability measured relative to a prevailing mean forecast. Panel B reports 3 measures of economic
significance associated with returns on a portfolio which utilizes the time-varying coefficient model forecast
in-pocket and the prevailing mean forecast out-of-pocket to allocate between small and big or high and low
(portfolio weights are limited to be between 0 and 2): the annualized estimated alpha in percentage points,
the t-statistic on the estimated alpha, and the annualized Sharpe Ratio of the portfolio. Significance of
the estimated alpha is assessed using a t-statistic estimated using HAC standard errors. We use a purely
backward-looking kernel to compute forecasts. “pc” is a recursively computed first principal component
of the four predictor variables. “comb1,” “comb2,” and “comb3” refer to using a simple average of the
univariate forecasts. “comb1” sets an individual predictor’s forecast to the time-varying coefficient model
forecast during a pocket and to the prevailing mean otherwise. “comb2” is the same as “comb1” except that
it ignores individual predictor forecasts when that variable is not in a pocket but at least one other variable
is in a pocket. “comb3” makes no distinction between pocket and non-pocket periods and always uses the
simple equal-weighted average of all four univariate models. The CW test statistics approximately follow
a normal distribution with positive values indicating more accurate out-of-sample return forecasts than the
prevailing mean benchmark and negative values indicating the opposite. A pocket is classified as a period
where a fitted squared forecast error differential (estimated using a 1-sided Kernel with a 1-year effective
sample size) is above 0 in the preceding period. Consider a particular statistic of interest, β. ∗’s represent
statistical significance at either the 10, 5, or 1% levels from a hypothesis test of β > 0. †’s represent statistical
significance at either the 10, 5, or 1% levels from a hypothesis test of β < 0.
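
The pocket classification rule in the caption (a fitted squared forecast error differential above 0 in the preceding period) can be sketched as below. The kernel shape, the handling of the first few observations, and the names `one_sided_smooth` and `pocket_indicator` are assumptions of this illustration; the caption does not pin them down.

```python
import numpy as np

def one_sided_smooth(sed, window):
    """Backward-looking kernel average of the squared error differential.

    Uses an Epanechnikov-shaped weight on the lags 0, ..., window-1
    (an assumed kernel choice for this sketch).
    """
    sed = np.asarray(sed, dtype=float)
    lags = np.arange(window) / window
    w = 1.0 - lags ** 2
    fitted = np.empty(sed.size)
    for t in range(sed.size):
        k = min(window, t + 1)               # fewer lags are available early on
        recent_first = sed[t - k + 1:t + 1][::-1]
        fitted[t] = np.dot(w[:k], recent_first) / w[:k].sum()
    return fitted

def pocket_indicator(sed, window):
    """Period t is in a pocket if the fitted differential at t-1 is positive."""
    fitted = one_sided_smooth(sed, window)
    in_pocket = np.zeros(fitted.size, dtype=bool)
    in_pocket[1:] = fitted[:-1] > 0
    return in_pocket

sed = np.array([1.0, 1.0, 1.0, -5.0, -5.0, -5.0])
flags = pocket_indicator(sed, window=2)
```

Because the indicator uses only information dated t-1 or earlier, it is available in real time when the period-t forecast is made.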

Bansal-Yaron Campbell-Cochrane Garleanu-Panageas Wachter Wachter (no disasters)
Stats Sample Avg. Std. err. p-val Avg. Std. err. p-val Avg. Std. err. p-val Avg. Std. err. p-val Avg. Std. err. p-val
dp
CWF S −0.74 −0.10 0.98 0.74 0.09 0.98 0.81 −0.03 0.96 0.78 0.31 1.02 0.84 0.65 1.05 0.90
CWIP 3.00 −0.07 1.02 0.00 −0.08 0.99 0.00 −0.09 0.97 0.00 0.15 1.00 0.00 0.34 1.00 0.00
CWOOP −1.62 −0.10 1.01 0.93 0.15 0.99 0.96 −0.00 0.97 0.95 0.25 1.03 0.97 0.54 1.04 0.97
α̂ 1.69 −0.38 1.67 0.11 0.05 0.95 0.04 −0.02 0.86 0.02 0.21 1.82 0.19 0.31 1.34 0.15
tα̂ 2.10 −0.23 1.00 0.01 0.05 1.01 0.02 −0.02 0.99 0.01 0.19 1.08 0.04 0.22 1.02 0.03
SR 0.47 0.44 0.13 0.40 0.47 0.07 0.54 0.33 0.11 0.08 0.46 0.12 0.46 0.58 0.10 0.89
risk-free
CWF S 0.68 −0.10 0.99 0.22 0.10 0.98 0.28 −0.06 0.96 0.22 0.31 1.02 0.36 0.65 1.06 0.50
CWIP 3.28 −0.09 1.03 0.00 −0.06 0.98 0.00 −0.09 0.97 0.00 0.15 1.00 0.00 0.35 1.00 0.00
CWOOP −1.58 −0.09 1.01 0.93 0.14 0.99 0.96 −0.04 0.96 0.94 0.25 1.03 0.96 0.54 1.04 0.97
α̂ 3.57 −0.37 1.67 0.01 0.05 0.95 0.00 −0.03 0.83 0.00 0.21 1.82 0.03 0.31 1.34 0.01
tα̂ 4.35 −0.23 1.00 0.00 0.05 1.01 0.00 −0.02 0.99 0.00 0.19 1.08 0.00 0.22 1.02 0.00
SR 0.79 0.44 0.13 0.01 0.47 0.07 0.00 0.33 0.11 0.00 0.46 0.12 0.00 0.58 0.10 0.03
rvar
CWF S −1.49 −0.08 1.03 0.91 −0.26 0.99 0.89 −0.13 0.97 0.92 0.05 1.00 0.95 0.22 1.08 0.95
CWIP 2.88 −0.08 1.02 0.00 −0.24 1.01 0.00 −0.11 0.96 0.00 −0.01 0.97 0.00 0.08 1.00 0.00
CWOOP −1.77 −0.06 1.01 0.95 −0.15 0.99 0.95 −0.10 0.98 0.95 0.10 1.01 0.97 0.22 1.02 0.98
α̂ 2.31 −0.38 1.70 0.06 −0.06 0.96 0.01 −0.04 0.85 0.01 0.05 1.29 0.04 0.17 1.33 0.06
tα̂ 3.63 −0.23 1.00 0.00 0.05 1.01 0.00 −0.02 0.99 0.00 0.19 1.08 0.00 0.22 1.02 0.00
SR 0.68 0.44 0.13 0.03 0.47 0.07 0.01 0.33 0.11 0.00 0.45 0.12 0.03 0.58 0.10 0.12

Table 9: OOS asset pricing model simulations (unrestricted). This table reports Monte Carlo simulation results of our 1-sided Kernel estimation
applied to data simulated from 4 different asset pricing models (this includes two specifications of Wachter’s rare disasters model, one of which omits data
from disaster episodes). We report 6 statistics. The first 3 are Clark and West (2007) t-statistics relative to a prevailing mean benchmark in the full sample,
in-pocket, and out-of-pocket. The second 3 are economic statistics associated with returns on a portfolio which utilizes the time-varying coefficient model
forecast in-pocket and the prevailing mean forecast out-of-pocket to allocate between the risk-free asset and the market (portfolio weights are limited to be
between 0 and 2): the annualized estimated alpha in percentage points, the HAC t-statistic associated with that alpha, and the annualized Sharpe Ratio
of the portfolio. Column 2 presents the corresponding statistics from the data for reference.
Baseline Baseline (λ = 0) RE recalibrated
Stats Data Avg. Std. err p-val Avg. Std. err p-val Avg. Std. err p-val
dp
CWf s -0.74 1.43 1.21 0.09 0.13 1.01 0.40 0.26 1.02 0.34
CWip 3.00 2.39 1.12 0.59 0.00 0.93 0.00 0.03 1.01 0.01
CWoop -1.62 -0.65 1.03 0.36 0.15 1.01 0.10 0.28 0.98 0.07
α 1.69 1.09 1.62 0.71 0.02 1.92 0.40 0.22 1.44 0.32
tα 2.10 0.79 1.19 0.28 0.01 0.99 0.05 0.16 1.01 0.07
SR 0.47 0.50 0.18 0.87 0.33 0.12 0.26 0.43 0.11 0.74
rf
CWf s 0.68 3.01 1.05 0.04 -0.17 1.04 0.42 -0.31 1.02 0.34
CWip 3.28 3.54 1.06 0.81 -0.15 0.99 0.00 -0.28 0.99 0.00
CWoop -1.58 0.42 1.08 0.08 -0.13 1.09 0.20 -0.18 1.06 0.20
α 3.57 3.70 1.58 0.93 -0.49 2.03 0.06 -0.53 1.45 0.01
tα 4.35 2.62 1.06 0.12 -0.27 1.04 0.00 -0.38 1.02 0.00
SR 0.79 0.62 0.18 0.37 0.33 0.13 0.00 0.43 0.12 0.01
rvar
CWf s -1.49 2.23 1.05 0.00 -0.20 0.98 0.21 -0.45 0.98 0.30
CWip 2.88 2.99 1.10 0.92 -0.14 1.02 0.01 -0.36 0.98 0.00
CWoop -1.77 -0.08 1.01 0.11 -0.17 1.00 0.13 -0.30 1.00 0.16
α 2.31 2.65 1.52 0.83 -0.65 1.95 0.15 -0.79 1.40 0.04
tα 3.63 1.87 1.05 0.11 -0.34 0.99 0.00 -0.56 0.98 0.00
SR 0.68 0.56 0.18 0.51 0.33 0.13 0.01 0.44 0.12 0.05
comb1
CWf s -4.48 3.47 1.00 0.00 -0.13 0.95 0.00 -0.29 0.97 0.00
CWip 4.57 3.51 1.03 0.32 -0.13 0.95 0.00 -0.29 0.97 0.00
CWoop - -0.01 0.96 - 0.02 0.97 0.98 0.01 0.99 0.99
α 6.38 4.06 1.47 0.13 -0.46 1.65 0.00 -0.44 1.16 0.00
tα 6.11 3.25 1.05 0.01 -0.27 0.94 0.77 -0.37 0.95 0.70
SR 1.00 0.69 0.18 0.11 0.33 0.13 0.00 0.43 0.12 0.00
comb2
CWf s 4.94 3.62 1.11 0.25 -0.06 0.95 0.00 -0.18 0.97 0.00
CWip 5.04 3.62 1.11 0.22 -0.06 0.95 0.00 -0.18 0.97 0.00
CWoop - 3.62 1.11 - -0.06 0.95 0.95 -0.18 0.97 0.86
α 6.10 5.67 2.05 0.84 -0.37 2.22 0.01 -0.39 1.65 0.00
tα 5.66 3.27 1.12 0.05 -0.16 0.94 0.87 -0.23 0.96 0.81
SR 0.87 0.78 0.20 0.65 0.33 0.13 0.00 0.44 0.12 0.00
comb3
CWf s -1.03 2.73 1.07 0.00 -0.09 1.00 0.36 -0.20 0.97 0.41
CWip 2.32 3.38 1.10 0.35 -0.11 1.02 0.03 -0.22 0.99 0.02
CWoop -2.12 -0.57 1.09 0.17 0.00 0.99 0.05 -0.05 0.99 0.05
α 0.76 3.21 1.60 0.14 -0.47 1.96 0.54 -0.41 1.39 0.41
tα 1.32 2.25 1.08 0.40 -0.24 0.99 0.81 -0.30 0.97 0.76
SR 0.43 0.59 0.18 0.39 0.33 0.13 0.43 0.43 0.11 0.99
mv
CWf s -0.99 2.21 1.10 0.01 0.12 1.06 0.31 0.12 1.06 0.31
CWip 3.74 2.76 1.12 0.39 -0.02 0.98 0.00 -0.02 0.98 0.00
CWoop -1.49 0.32 1.00 0.09 0.15 1.05 0.13 0.15 1.05 0.13
α 2.59 2.49 1.61 0.95 0.12 2.02 0.24 0.12 2.02 0.24
tα 3.37 1.78 1.13 0.18 0.07 1.03 0.95 0.07 1.03 0.95
SR 0.58 0.56 0.18 0.89 0.33 0.12 0.06 0.33 0.12 0.06

Table 10: Sticky expectations model simulation results. This table reports Monte Carlo results for
the 1-sided Kernel empirical results using simulated data from the Sticky Expectations Model. We generate
500 bootstrap samples of the same sample size as is available for each predictor in the data for three separate
calibrations. “Baseline” refers to the standard calibration with sticky expectations, “Baseline (λ = 0)” refers
to the “Baseline” calibration but with rational expectations (i.e., λ = 0), and “RE Recalibrated” refers to
a recalibration of the rational expectations model to match the target moments. “dp” refers to the log
dividend-price ratio, “rf” refers to the log risk-free rate, and “rvar” refers to realized variance on a 60-
day trailing window. A pocket is classified as a period where a fitted (using a 1-sided Kernel with a 1-year
effective sample size) squared forecast error differential is above 0 in the preceding period. For each predictor
and each calibration, we report 6 statistics. The first 3 are Clark and West (2007) t-statistics relative to
a prevailing mean benchmark in the full sample, in-pocket, and out-of-pocket. The second 3 are economic
statistics associated with returns on a portfolio which utilizes the time-varying coefficient model forecast
in-pocket and the prevailing mean forecast out-of-pocket to allocate between the risk-free asset and the
market (portfolio weights are limited to be between 0 and 2): the annualized estimated alpha in percentage
points, the HAC t-statistic associated with that alpha, and the annualized Sharpe Ratio of the portfolio.
The column “Data” reports the corresponding statistics from the data for reference.


Figure 1: Local return predictability (daily benchmark specification). The first four panels plot 1-sided non-parametric kernel
estimates of the fitted squared forecast error differential $\widehat{SED}_t$ (estimated using a 1-sided Kernel with a 1-year effective sample size) from a
regression of daily excess stock returns on each of the four predictor variables using an effective sample size of 2.5 years. The final panel plots
the local $\widehat{SED}_t$ from a four-variable regression specification with coefficients estimated using a product Kernel. The shaded areas represent
periods when $\widehat{SED}_t > 0$, with areas colored in red representing pockets that have less than a 5% chance of being spurious and areas colored in blue
representing pockets that have more than a 5% chance of being spurious. The sampling distributions used to determine spuriousness come from
an EGARCH(1,1) residual bootstrap design.
[Figure panels: dp, tbl, tsp, rvar, mv. Horizontal axis: year, 1940 to 2010.]
Figure 2: Local return predictability (monthly benchmark specification). The first four panels plot 1-sided non-parametric kernel
estimates of the fitted squared forecast error differential $\widehat{SED}_t$ (estimated using a 1-sided Kernel with a 1-year effective sample size) from a
regression of daily excess stock returns on each of the four predictor variables using an effective sample size of 2.5 years. The final panel plots
the local $\widehat{SED}_t$ from a four-variable regression specification with coefficients estimated using a product Kernel. The shaded areas represent
periods when $\widehat{SED}_t > 0$, with areas colored in red representing pockets that have less than a 5% chance of being spurious and areas colored in blue
representing pockets that have more than a 5% chance of being spurious. The sampling distributions used to determine spuriousness come from
an EGARCH(1,1) residual bootstrap design.
[Bar chart. Vertical axis: Correlation. Horizontal axis: Coibion-Gorodnichenko Variables (gy, ue, ip). Bars within each group: dp, tbl, tsp, rvar, pc, mv, comb1, comb2, comb3.]

Figure 3: Correlation of Coibion-Gorodnichenko forecast errors with excess return forecasts.
This figure reports correlations between forecast errors of three macroeconomic variables from the Survey of
Professional Forecasters (SPF) and excess return forecasts from our time-varying coefficient models. The
three sets of bar graphs correspond to forecast errors for real GDP growth (gy), the unemployment rate
(ue), and real industrial production growth (ip). The heights of the nine colored bars represent correlations
of those forecast errors with the labeled excess return forecasts from our time-varying predictor models.
Each bar is bracketed by a 95% confidence interval computed using HAC standard errors. Since the SPF
respondents send in their forecasts in the middle of each quarter, we only use excess return forecasts from
the first month of each quarter to make the information sets consistent.

Web Appendices
A Pockets and Time-Varying Risk Premia
This appendix establishes a set of conditions under which the conventional constant-coefficient
return prediction model (1) holds almost exactly within a fairly general class of endowment economies
nesting many canonical asset pricing specifications considered in the literature. We parameterize
cash flow risks and investor preferences in the economy, allowing for time variation in either the
quantity or the price of risk. To this end, let $z_t$ be an $L \times 1$ vector of state variables capturing
the aggregate state of the economy. We assume that this vector evolves according to the following law of
motion:

Assumption 1 The aggregate state of the economy follows a stationary VAR process:

$$z_{t+1} = \mu + F z_t + \varepsilon_{t+1}, \tag{A.1}$$

with $z_0$ given, where the $L \times L$ matrix $F$ has all of its eigenvalues inside the unit circle and
$E[\varepsilon_{t+1}] = 0$. Moreover, the log of aggregate dividend growth, $\Delta d_{t+1}$, equals $S_d' z_{t+1}$ for some $L \times 1$
vector $S_d$.

Assumption 1, which is quite standard, states that aggregate dividend growth can be captured
by a linear combination of the elements of a finite-dimensional, stationary vector autoregressive
process, $z_t$. We will place further restrictions on the vector of innovations below.
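
As a concrete illustration of Assumption 1, the sketch below checks the eigenvalue condition on $F$ and simulates the state vector and the implied dividend growth. All parameter values, the Gaussian shocks, and the function names are hypothetical choices for the example; Assumption 1 itself only requires mean-zero innovations.

```python
import numpy as np

def is_stationary(F):
    """True if every eigenvalue of F lies strictly inside the unit circle."""
    return bool(np.max(np.abs(np.linalg.eigvals(F))) < 1.0)

def simulate_state(mu, F, S_d, T, rng, shock_scale=0.01):
    """Simulate z_{t+1} = mu + F z_t + eps_{t+1} and dividend growth S_d' z_{t+1}."""
    L = len(mu)
    z = np.zeros((T + 1, L))
    # Start at the unconditional mean (I - F)^{-1} mu.
    z[0] = np.linalg.solve(np.eye(L) - F, mu)
    for t in range(T):
        eps = shock_scale * rng.standard_normal(L)   # illustrative Gaussian shocks
        z[t + 1] = mu + F @ z[t] + eps
    dividend_growth = z[1:] @ S_d
    return z, dividend_growth

F = np.array([[0.9, 0.0], [0.1, 0.5]])
mu = np.array([0.001, 0.0])
S_d = np.array([0.0, 1.0])
z, dd = simulate_state(mu, F, S_d, T=250, rng=np.random.default_rng(0))
```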
In addition to the restrictions on the cash flow process in Assumption 1, we restrict investor
preferences. In particular, Assumption 2 will impose that the log risk-free rate and pricing kernel are
“essentially” affine functions of the $z_t$ vector that summarizes the aggregate state of the economy,
possibly with time-varying prices of risk.

Assumption 2 The continuously compounded risk-free rate, $r_{f,t+1}$, satisfies

$$r_{f,t+1} = A_{0,f} + A_f' z_t, \tag{A.2}$$

and the continuously compounded return on any financial asset, $r_{a,t+1}$, satisfies the Euler equation

$$1 = E_t\left[\exp\left(-\Lambda_t' \varepsilon_{t+1} - \log E_t \exp[-\Lambda_t' \varepsilon_{t+1}] + r_{a,t+1} - r_{f,t+1}\right)\right], \tag{A.3}$$

where $\Lambda_t$ is an $L \times 1$ vector of risk prices.

A large class of models have risk-free rates and pricing kernels which fit into this class. For
example, Assumption 2 holds approximately in a representative agent model where agents have
Epstein and Zin (1989) preferences when aggregate consumption growth is also an affine function
of the state vector.51 Thus, our results will apply to many of the specifications considered in
the literature on consumption-based asset pricing models with long-run risks and rare disasters.
This property also holds in an incomplete markets setting with state-dependent higher moments
of uninsurable idiosyncratic shocks.52 We also allow, with some restrictions discussed below, for
time-variation in the price of risk, Λt , which enables our results to nest many models which have
been used to characterize the term structure of interest rates as well as the log-linearized stochastic
discount factor of the Campbell-Cochrane habit formation model.
Finally, we provide two alternative sets of restrictions on risk prices and quantities which ensure
that, up to a log-linear approximation, price-dividend ratios and market returns are exponential
affine functions of zt .53 We also define a partition of the set of state variables zt in a way which
will be useful later.

Assumption 3 Partition the state vector $z_t = [z_{1t}', z_{2t}']'$, where $\dim(z_{1t}) = L_1 \leq L$. One of the
following sets of conditions is satisfied:

1. Risk prices are constant: $\Lambda_t = \Lambda$. In addition, for any $\gamma \in \mathbb{R}^L$, $E_t[\varepsilon_{t+1}] = 0$ and the
conditional Laplace transform of $\varepsilon_{t+1}$ satisfies

$$\log E_t[\exp(\gamma' \varepsilon_{t+1}) \mid z_t] = f(\gamma) + g(\gamma)' z_{1t}, \tag{A.4}$$

where $f(\gamma): \mathbb{R}^L \to \mathbb{R}$ and $g(\gamma): \mathbb{R}^L \to \mathbb{R}^{L_1}$.

2. Risk prices satisfy $\Lambda_t = \Lambda_0 + \Lambda_1 z_{1t}$, where $\Lambda_1$ is an $L \times L_1$ matrix, and $\varepsilon_{t+1} \overset{iid}{\sim} MVN(0, \Sigma)$,
where $\Sigma$ is a positive semi-definite matrix.

Assumption 3 characterizes two sets of assumptions which are commonly made to get affine
valuation ratios. In the first case, we assume that risk prices are constant but risk quantities
are time-varying. $z_{1t}$ is the subset of variables (e.g., stochastic volatility and/or Poisson jump
intensities) that are useful for predicting the quantity of risk, while $z_{2t}$ contains additional variables
useful for predicting cash flows or the risk-free rate. We have summarized our main restriction
on the distribution of $\varepsilon_{t+1}$ in terms of its cumulant generating function, which is the logarithm
of its moment generating function. The affine structure greatly facilitates analytical tractability
and is satisfied for a wide class of distributions used in the theoretical asset pricing literature.54
For instance, suppose that $\varepsilon_{t+1} \sim MVN(0, \sigma_t^2 \Sigma)$ for some positive semi-definite matrix $\Sigma$. Then
$f(\gamma) = 0$ and $g(\gamma)' z_{1t} = \frac{1}{2} \gamma' \Sigma \gamma \, \sigma_t^2$ with $z_{1t} = \sigma_t^2$.
In the second case, we allow for risk prices to be affine in a subset of the state variables, $z_{1t}$, but
restrict the innovations $\varepsilon_{t+1}$ to be homoskedastic and multivariate normally distributed.55 In this
case, $z_{1t}$ indicates the subset of variables which characterize time variation in the price of risk $\Lambda_t$.
These assumptions are quite common in the bond pricing literature as well as for models featuring
time-varying risk aversion and are identical to those in Lustig, van Nieuwerburgh and Verdelhan
(2013), among others.

51
See, e.g., Bansal and Yaron (2004), Hansen, Heaton and Li (2008), Eraker and Shaliastovich (2008) and Drechsler
and Yaron (2011).
52
See, e.g., Constantinides and Duffie (1996), Constantinides and Ghosh (2017), Schmidt (2016), and Herskovic
et al. (2016).
53
Note that we can get exact exponential affine expressions for the price-dividend ratios and returns of dividend
strips (i.e., the value as of time $t$ of a dividend paid at time $t + k$ for any $k$). The linearization is only
necessary because the market return is a weighted average of these individual dividend strip returns which is not
exactly affine in the state vector. Some authors, such as Lettau and van Nieuwerburgh (2008), have elected to work
with the exact dividend strip formulas.

To solve for asset prices in this economy, we apply the Campbell and Shiller (1988) log-linearization
of the stock market return, $r_{s,t+1}$, in excess of the risk-free rate, $r_{f,t+1}$, as a function
of the log-dividend growth rate, $\Delta d_{t+1}$, and the log price-dividend ratios at time $t + 1$ and $t$, $pd_{t+1}$
and $pd_t$:

$$r_{s,t+1} \approx c + \Delta d_{t+1} + \rho \cdot pd_{t+1} - pd_t. \tag{A.5}$$

Here $c$ and $\rho < 1$ are linearization constants. Using this linearization and Assumptions 1-3, we can
show the following result:

Proposition 1 Suppose Assumptions 1, 2, and 3 are valid and that a solution exists to the log-linearized
asset pricing model. Then, the following properties are satisfied:
(i) The market price-dividend ratio is

$$pd_t = A_{0,m} + A_m' z_t;$$

(ii) The expected excess return is

$$E_t[r_{s,t+1}] - r_{f,t+1} = \beta_0 + \beta' z_{1t},$$

where $A_{0,m}$, $A_{0,f}$, $\beta_0$ are scalars and $A_m \in \mathbb{R}^L$ and $\beta \in \mathbb{R}^d$.

Part (i) of Proposition 1 shows that the log price-dividend ratio is an affine function of the
aggregate state vector, which immediately implies that the log-linearized market return is also
an affine function of $z_t$ and $\varepsilon_{t+1}$. Part (ii) of the proposition characterizes the extent of return
predictability. It shows that risk premia (expected log excess returns) are an affine function of $z_{1t}$,
54
For example, the property holds for affine jump-diffusion models, e.g., Eraker and Shaliastovich (2008) and
Drechsler and Yaron (2011). In these models, $\varepsilon_{t+1}$ is the sum of Gaussian and jump components, and the
variance-covariance matrix for the Gaussian shocks and the arrival intensities for the jump shocks are affine functions of $z_t$. See
also Bekaert and Engstrom (2017) and Creal and Wu (2016) for alternative stochastic processes with affine cumulant
generating functions.
55
Creal and Wu (2016) provide some restrictions which permit both risk prices and quantities to vary while keeping
valuation ratios in the affine class. We do not detail these assumptions here, but note that the constant coefficient
result should obtain for this more general case as well.

variables used to forecast cash flows and the risk free rate. For a set of predictors xt chosen to
be elements of the underlying state variables (z1t ), Proposition 1 justifies using constant-coefficient
linear return prediction models of the form in (1).
Part (ii) of Proposition 1 also indicates the extent to which the theory allows for some degree
of dimension reduction. In principle, one could allow for a very large number of state variables
to predict cash flow growth, each of which could have innovations which may even be priced.
Nonetheless, if these variables do not predict time-variation in the quantity of risk (under the
conditions of Assumption 3, part 1) or the price of risk (under the conditions of Assumption 3,
part 2), they may safely be omitted from the predictive regression. On the other hand, if the true
state variables $z_{1t}$ are not spanned by the choice of predictors, $x_t$, included in the return regression,
as could be the case if there are additional drivers of risk prices or quantities omitted from the
regression, it need not necessarily be the case that the projection of $r_{s,t+1} - r_{f,t+1}$ on the empirical
proxies would have constant coefficients.
As is the case for many asset pricing tests, it is worth emphasizing that we can only test the
joint hypothesis that the model is correctly specified (i.e., we have the correct predictors) and the
theoretical restrictions (constant coefficients) hold. Thus, an important caveat on interpretations
of our results is that any evidence which is inconsistent with the null of constant coefficients could
potentially be explained by omitted factors, as opposed to our incomplete cash-flow learning story.
Proof. To show part (i) of Proposition 1, we conjecture and verify that the price-dividend ratio
is $pd_t = A_{0,m} + A_m' z_t$.
By Assumption 1, $\Delta d_t = S_d' z_t$. Suppose that Assumption 3.1 holds, so that $\Lambda_t = \Lambda$. Using
$r_{s,t+1} \approx \kappa + \rho(p_{t+1} - d_{t+1}) + \Delta d_{t+1} + d_t - p_t$, where $\kappa$ is the linearization constant denoted $c$ in (A.5),
and plugging the log-linearized return into the Euler equation, we have

$$1 = \exp\left[-A_{0,f} - A_f' z_t - \log E_t \exp[-\Lambda' \varepsilon_{t+1}] + \kappa + (\rho - 1) A_{0,m} - A_m' z_t\right] \times E_t\left\{\exp\left(-\Lambda' \varepsilon_{t+1} + [S_d' + \rho A_m'] z_{t+1}\right)\right\},$$

$$0 = -A_{0,f} - A_f' z_t + \kappa + (\rho - 1) A_{0,m} - A_m' z_t + [S_d' + \rho A_m'](\mu + F z_t) + [f(-\Lambda + S_d + \rho A_m) - f(-\Lambda)] + [\tilde{g}(-\Lambda + S_d + \rho A_m)' - \tilde{g}(-\Lambda)'] z_t,$$

where $\tilde{g}(u) \equiv [g(u)', 0']'$ and the second line takes logs and applies the law of motion (A.1) together with
the Laplace transform condition (A.4). Matching the constant and the coefficients on $z_t$
yields the $(L + 1)$-dimensional system of equations in $A_{0,m}$ and $A_m$

$$f(-\Lambda + S_d + \rho A_m) - f(-\Lambda) - A_{0,f} + \kappa + (\rho - 1) A_{0,m} + (S_d' + \rho A_m') \mu = 0,$$

$$\tilde{g}(-\Lambda + S_d + \rho A_m) - \tilde{g}(-\Lambda) - A_f - (I - \rho F') A_m + F' S_d = 0.$$

This system does not have an analytical solution in the general case; however, it is relatively
straightforward to solve the system numerically. We note that the assumptions on the data generating
process in Assumption 3.2 are identical to those in Lustig, van Nieuwerburgh and Verdelhan (2013). Therefore,
we refer the interested reader to the proof of their Proposition 1 for full derivations of the $A_{0,m}$
and $A_m$ coefficients in that case.
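
To make the remark about numerical solution concrete, the sketch below solves the $(L+1)$-dimensional system for a small Gaussian example in which $f(\gamma) = 0$ and $g(\gamma) = \frac{1}{2}\gamma'\Sigma\gamma$, with the first state variable playing the role of $z_{1t}$. It uses a simple fixed-point iteration on the second equation, which is one possible approach and not necessarily the one used by the authors. Every parameter value below is hypothetical and chosen only so that the iteration converges.

```python
import numpy as np

# Hypothetical calibration (illustration only).
rho, kappa = 0.97, 0.05
mu = np.array([0.001, 0.0])
F = np.diag([0.9, 0.5])
S_d = np.array([0.0, 1.0])
A0_f, A_f = 0.01, np.array([0.0, 0.1])
Lam = np.array([1.0, 2.0])
Sigma = 0.01 * np.eye(2)

def f(u):
    return 0.0                                     # Gaussian example: f = 0

def g_tilde(u):
    return np.array([0.5 * u @ Sigma @ u, 0.0])    # [g(u)', 0']' with L1 = 1

def solve_pd_coefficients(tol=1e-12, max_iter=1000):
    M = np.eye(2) - rho * F.T
    A_m = np.zeros(2)
    for _ in range(max_iter):
        u = -Lam + S_d + rho * A_m
        # Second equation rearranged as (I - rho F') A_m = right-hand side.
        A_new = np.linalg.solve(M, g_tilde(u) - g_tilde(-Lam) - A_f + F.T @ S_d)
        converged = np.max(np.abs(A_new - A_m)) < tol
        A_m = A_new
        if converged:
            break
    u = -Lam + S_d + rho * A_m
    # First equation is linear in A_{0,m} once A_m is known.
    A0_m = (f(u) - f(-Lam) - A0_f + kappa + (S_d + rho * A_m) @ mu) / (1.0 - rho)
    return A0_m, A_m

def residuals(A0_m, A_m):
    u = -Lam + S_d + rho * A_m
    r0 = f(u) - f(-Lam) - A0_f + kappa + (rho - 1.0) * A0_m + (S_d + rho * A_m) @ mu
    r1 = g_tilde(u) - g_tilde(-Lam) - A_f - (np.eye(2) - rho * F.T) @ A_m + F.T @ S_d
    return r0, r1

A0_m, A_m = solve_pd_coefficients()
res0, res1 = residuals(A0_m, A_m)
```

For calibrations where the fixed-point map is not a contraction, a general-purpose root finder applied to `residuals` would be the natural alternative.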

To show part (ii), we follow a very similar argument to Drechsler and Yaron (2011). Define
$B_m \equiv S_d + \rho A_m$. We can write expected returns as follows (using the normalization $E_t[\varepsilon_{t+1}] = 0$):56

$$E_t[\exp(r_{s,t+1})] = \exp[E_t r_{s,t+1}] \, E_t[\exp([S_d' + \rho A_m'] \varepsilon_{t+1})] \equiv \exp[E_t r_{s,t+1}] \, E_t[\exp(B_m' \varepsilon_{t+1})],$$

$$\exp(-r_{f,t+1}) \equiv \exp[E_t m_{t+1}] \, E_t[\exp(-\Lambda_t' \varepsilon_{t+1})].$$

Next, using the Euler equation in (A.3) and the law of iterated expectations, we have

$$1 = \exp[E_t r_{s,t+1}] \exp[E_t m_{t+1}] \, E_t \exp[(-\Lambda_t' + B_m') \varepsilon_{t+1}],$$

$$\frac{E_t \exp[B_m' \varepsilon_{t+1}] \, E_t \exp[-\Lambda_t' \varepsilon_{t+1}]}{E_t \exp[(-\Lambda_t' + B_m') \varepsilon_{t+1}]} = \exp[E_t r_{s,t+1}] \exp[E_t m_{t+1}] \, E_t \exp[B_m' \varepsilon_{t+1}] \, E_t \exp[-\Lambda_t' \varepsilon_{t+1}] = E_t[\exp(r_{s,t+1})] \exp(-r_{f,t+1}).$$

Taking logs and noting that $E_t r_{s,t+1} = \log E_t \exp(r_{s,t+1}) - \log E_t \exp[B_m' \varepsilon_{t+1}]$, we get

$$E_t[r_{s,t+1}] - r_{f,t+1} = \log E_t \exp[-\Lambda_t' \varepsilon_{t+1}] - \log E_t \exp[(-\Lambda_t' + B_m') \varepsilon_{t+1}]. \tag{A.6}$$

Suppose that Assumption 3.1 holds. Then, (A.6) simplifies to

$$E_t[r_{s,t+1}] - r_{f,t+1} = f(-\Lambda) - f(-\Lambda + B_m) + [g(-\Lambda) - g(-\Lambda + B_m)]' z_{1t}, \tag{A.7}$$

which establishes the claim. If Assumption 3.2 holds, we can evaluate each one of the expressions
in (A.6) using the cumulant generating function of the normal distribution:

$$E_t[r_{s,t+1}] - r_{f,t+1} = -\tfrac{1}{2} B_m' \Sigma B_m + B_m' \Sigma \Lambda_t = -\tfrac{1}{2} B_m' \Sigma B_m + B_m' \Sigma [\Lambda_0 + \Lambda_1 z_{1t}], \tag{A.8}$$

which also establishes the claim. The first term is due to Jensen’s inequality, while the second
captures the covariance between the market return and the priced risk factors. Collecting terms in
front of $z_{1t}$ in the two equations above yields the expressions for $\beta$ under the two sets of assumptions.
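
The step from (A.6) to (A.8) can be checked numerically. The sketch below evaluates both sides under the Gaussian cumulant generating function $\log E_t \exp(\gamma' \varepsilon_{t+1}) = \frac{1}{2}\gamma'\Sigma\gamma$ and confirms that the premium is affine in $z_{1t}$ with slope $\beta = \Lambda_1' \Sigma B_m$. The numerical values and function names are hypothetical.

```python
import numpy as np

def gaussian_cgf(gamma, Sigma):
    """log E exp(gamma' eps) for eps ~ MVN(0, Sigma)."""
    return 0.5 * gamma @ Sigma @ gamma

def premium_a6(Lam_t, B_m, Sigma):
    """Right-hand side of (A.6) under Gaussian shocks."""
    return gaussian_cgf(-Lam_t, Sigma) - gaussian_cgf(-Lam_t + B_m, Sigma)

def premium_a8(Lam_t, B_m, Sigma):
    """Right-hand side of (A.8)."""
    return -0.5 * B_m @ Sigma @ B_m + B_m @ Sigma @ Lam_t

# Hypothetical inputs: L = 3 shocks, L1 = 2 risk-price state variables.
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
B_m = np.array([0.5, -0.2, 1.0])
Lam0 = np.array([1.0, 0.5, -0.3])
Lam1 = np.array([[0.2, 0.0],
                 [0.0, 0.4],
                 [0.1, -0.1]])
z1 = np.array([0.7, -1.2])
Lam_t = Lam0 + Lam1 @ z1

# Intercept and slope obtained by collecting terms in (A.8).
beta0 = -0.5 * B_m @ Sigma @ B_m + B_m @ Sigma @ Lam0
beta = Lam1.T @ Sigma @ B_m
```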

56
Note that this normalization is for convenience. Given our assumptions on the relationship between the
distribution of $\varepsilon_{t+1}$ and the state vector, the mean of $\varepsilon_{t+1}$ would be affine in $z_t$ in the absence of
this normalizing assumption. Therefore, we could always include this additional term in $\mu$ and $F$ in equation (A.1).

B Details of the Nonparametric Estimation
This appendix describes our nonparametric estimation approach. Robinson (1989) and Cai (2007)
consider local constant and local linear approximations of β respectively, but this approach can
easily be generalized to accommodate polynomials of arbitrary order. In particular, we can ap-
t
proximate the function β t as a pth -order Taylor expansion about the point T (where p ≥ 0). To
this end, define the quantities:
  p 0
s−t s−t
Wst = 1, ,..., ,
T T
 
s−t
Kst = K ,
hT
Qst = Wst ⊗ xs ,

for $s, t = 1, \ldots, T$, where $K$ is a kernel function and $h \equiv h(T)$ is the bandwidth. More formally, $K : [-1, 1] \to \mathbb{R}_+$ is a function that is symmetric about 0 and integrates to 1, and $h \in [0, 1]$ satisfies $h \to 0$ and $hT \to \infty$ as $T \to \infty$.
The local polynomial estimator $\beta = (\beta_0', \beta_1', \ldots, \beta_p')'$ is obtained by solving

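\[
\min_{\beta \in \mathbb{R}^{(p+1)d}} \sum_{s=t-\lfloor hT \rfloor}^{t+\lfloor hT \rfloor} K_{st} \left[ r_{s+1} - \beta_0' x_s - \beta_1' x_s \left(\frac{s-t}{T}\right) - \ldots - \beta_p' x_s \left(\frac{s-t}{T}\right)^{p} \right]^2
= \min_{\beta \in \mathbb{R}^{(p+1)d}} \sum_{s=t-\lfloor hT \rfloor}^{t+\lfloor hT \rfloor} K_{st} \left[ r_{s+1} - \beta' Q_{st} \right]^2 .
\]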

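Solving this optimization problem for $\beta$ gives the solution
\[
\hat\beta_t = \left[ \sum_{s=t-\lfloor Th \rfloor}^{t+\lfloor Th \rfloor} K_{st} Q_{st} Q_{st}' \right]^{-1} \left[ \sum_{s=t-\lfloor Th \rfloor}^{t+\lfloor Th \rfloor} K_{st} Q_{st} r_{s+1} \right]. \tag{B.9}
\]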

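Our object of interest, $\hat\beta_{1t}$, is the first block of $d$ elements of $\hat\beta_t$ and so is given by
\[
\hat\beta_{1t} = (e_1' \otimes I_d)\, \hat\beta_t,
\]
where $e_1$ is the first standard basis vector of $\mathbb{R}^{p+1}$, $I_d$ is a $(d \times d)$ identity matrix, and $d$ is the dimension of $x_t$. This can also be thought of as the OLS estimator of $\beta_0$ in the transformed model
\[
K_{st}^{1/2} r_{s+1} = K_{st}^{1/2} \sum_{q=0}^{p} \left(\frac{s-t}{T}\right)^{q} x_s' \beta_q + \varepsilon_{s+1}.
\]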

The asymptotic properties of these estimators are studied in Robinson (1989) and Cai (2007). Under various regularity conditions, it can be shown that the estimator $\hat\beta_t$ in (B.9) is consistent and asymptotically normal.
Our main empirical results adopt a local constant (Nadaraya-Watson) estimation procedure and so set p = 0. The motivation behind this choice is that the nonparametric procedures require very large amounts of data to perform well in finite samples and every additional degree of approximation requires that we estimate dT additional parameters.
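
The estimator in (B.9) can be sketched numerically as follows. This is a minimal illustration, not the code used for the paper: the Epanechnikov kernel, the function names, and the array layout (`r_next[s]` holding $r_{s+1}$, `x[s]` holding $x_s$) are assumptions made here, and the window is two-sided as written in (B.9).

```python
import numpy as np

def epanechnikov(u):
    """Assumed kernel: symmetric on [-1, 1] and integrates to one."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def local_poly_beta(r_next, x, t, h, p=0, kernel=epanechnikov):
    """Local polynomial estimate of beta at time index t, following (B.9).

    r_next : array of shape (T,), r_next[s] is the return r_{s+1}
    x      : array of shape (T, d), x[s] is the predictor vector x_s
    h      : bandwidth as a fraction of the sample size T
    p      : polynomial order (p = 0 is the local constant case)
    """
    T, d = x.shape
    half = int(np.floor(h * T))
    s = np.arange(max(t - half, 0), min(t + half, T - 1) + 1)
    u = (s - t) / T
    K = kernel((s - t) / (h * T))                  # kernel weights K_st
    W = np.vander(u, N=p + 1, increasing=True)     # rows (1, u, ..., u^p)
    # Q_st = W_st (Kronecker product) x_s, stacked over s
    Q = np.einsum("sq,sd->sqd", W, x[s]).reshape(len(s), (p + 1) * d)
    A = (Q * K[:, None]).T @ Q                     # sum of K_st Q_st Q_st'
    b = (Q * K[:, None]).T @ r_next[s]             # sum of K_st Q_st r_{s+1}
    beta_hat = np.linalg.solve(A, b)
    return beta_hat[:d]                            # (e_1' kron I_d) beta_hat
```

With `p=0` the function reduces to a kernel-weighted least squares regression of $r_{s+1}$ on $x_s$ in a window around $t$.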

C Cumsum Plots
To get a sense of how predictive accuracy evolves over time, Figure A.1 follows Welch and Goyal (2008) and plots the cumulative sum of squared forecast error differentials using real-time forecasts from our local kernel regressions versus forecasts from the prevailing mean model, i.e.,
\[
CSSED_t = \sum_{\tau=t_0}^{t} \left( e^2_{\tau|\tau-1} - \hat e^2_{\tau|\tau-1} \right).
\]
Here $e^2_{\tau|\tau-1}$ and $\hat e^2_{\tau|\tau-1}$ are the squared forecast errors from the prevailing mean and local kernel regression models, respectively, and $t_0$ is the initial data point in the (out-of-sample) test period. Positive and increasing values of $CSSED_t$ indicate periods in which the local kernel regression produces smaller squared forecast errors than the prevailing mean and thus is more accurate; periods with declining (negative) values show the reverse. Pockets are marked by grey vertical bars.
Panels in the top row show how in-pocket predictive accuracy evolves by letting the CSSEDt
line be flat outside pockets while panels in the bottom row do the reverse, flatlining the CSSEDt
curve in the pockets and tracking how it evolves outside the pockets. For the univariate T-bill rate
model, the CSSEDt curve rises inside most pockets (top panel) while it systematically declines
and is negative outside the pockets (bottom panel).
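
The construction of the curves described above can be sketched as follows. The function name and the boolean pocket indicator are hypothetical; the inputs are assumed to be aligned arrays of out-of-sample forecast errors.

```python
import numpy as np

def cssed(e_pm, e_kernel, in_pocket=None):
    """Cumulative sum of squared forecast error differentials.

    e_pm      : forecast errors from the prevailing mean model
    e_kernel  : forecast errors from the local kernel regression
    in_pocket : optional boolean array marking in-pocket periods

    Returns the full curve, or (full, in-pocket, out-of-pocket) curves when a
    pocket indicator is supplied. The in-pocket curve is flat outside pockets
    and the out-of-pocket curve is flat inside pockets.
    """
    d = np.asarray(e_pm, dtype=float) ** 2 - np.asarray(e_kernel, dtype=float) ** 2
    full = np.cumsum(d)
    if in_pocket is None:
        return full
    mask = np.asarray(in_pocket, dtype=bool)
    inside = np.cumsum(np.where(mask, d, 0.0))
    outside = np.cumsum(np.where(mask, 0.0, d))
    return full, inside, outside
```

By construction the in-pocket and out-of-pocket curves add up to the full-sample curve at every date.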

D Stambaugh Bias
In cases where the predictor variable follows a highly persistent process and the correlation between
innovations to the predictor variable and shocks to the return equation is large, Stambaugh (1999)
showed that the estimated slope coefficient in equation (2) can be subject to a potentially large
finite-sample bias. Both conditions are satisfied in our return regression that uses the dividend-price
ratio; in particular, the estimated persistence of the daily dividend-price ratio series is 0.9995.
The Stambaugh bias affects inference based on the estimated slope coefficient $\hat\beta_t$. However, it does not lead our approach to spuriously identify out-of-sample pockets. Biases in the local regression estimates of $\beta_t$ will tend to reduce the accuracy of our time-varying return forecasts, leading to fewer periods in which $SED_t > 0$ and fewer pockets. Rather than making the pockets that we identify spurious, this reduces their number.
Still, biases in estimated slope coefficients could affect which pockets get identified through their effect on our $\widehat{SED}_t$ measure, so we next explore this point through Monte Carlo simulations.
First, we generate joint standard normal random variables with a correlation of $\rho_{r,x}$:
\[
\begin{bmatrix} v_{r,t+1} \\ v_{x,t+1} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & \rho_{r,x} \\ \rho_{r,x} & 1 \end{bmatrix} \right), \tag{D.10}
\]
where $\rho_{r,x}$ takes values in $\{-0.5, -0.8, -0.9, -0.95, -0.99\}$. We next convert these normal draws to uniform random variables by evaluating the standard normal cdf of each series, $\{\Phi(v_{r,t+1}), \Phi(v_{x,t+1})\}$. Let $\hat Q_r$ and $\hat Q_x$ denote the empirical quantile functions of the normalized residuals from the estimated EGARCH(1,1) model (11), $\{\hat u_{r,t+1}\}$ and $\{\hat u_{x,t+1}\}$, respectively. We convert the uniform random variables to bootstrap samples of the normalized residuals by evaluating them at their respective empirical quantile functions, $\{\hat Q_r(\Phi(v_{r,t+1})), \hat Q_x(\Phi(v_{x,t+1}))\}$.
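
The three steps above (correlated normal draws, conversion to uniforms, evaluation of the empirical quantile functions) can be sketched as below. The function and argument names are hypothetical, and `np.quantile` with its default linear interpolation is used here as a stand-in for the empirical quantile functions $\hat Q_r$ and $\hat Q_x$.

```python
import math
import numpy as np

def copula_bootstrap(u_r, u_x, rho, n, seed=0):
    """Draw n pairs of normalized residuals with Gaussian dependence rho.

    u_r, u_x : arrays of normalized residuals (return and predictor equations)
    rho      : correlation of the underlying normal draws, as in (D.10)
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    v = rng.multivariate_normal(np.zeros(2), cov, size=n)   # step 1: (v_r, v_x)
    std_normal_cdf = np.vectorize(
        lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    )
    p = std_normal_cdf(v)                                   # step 2: uniforms
    sim_r = np.quantile(u_r, p[:, 0])                       # step 3: Q_r(Phi(v_r))
    sim_x = np.quantile(u_x, p[:, 1])                       #         Q_x(Phi(v_x))
    return sim_r, sim_x
```

The simulated series keep the marginal distributions of the original residuals while their dependence is controlled by `rho`.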
Simulation results (reported in Appendix Table A.4) show that the simulated statistical models
become slightly worse at matching the number of pockets and the fraction of significant observations
inside pockets as the correlation parameter is reduced from -0.5 to -0.99. For example, for the
EGARCH model the p-value of the alpha t-statistic decreases from 0.10 when ρr,x = −0.50 to 0.01
when ρr,x = −0.99.
Overall, however, changes in the correlation $\rho_{r,x}$ have only a modest effect on the simulation results. Thus, while the assumed correlation is clearly important for inference about the significance of predictive return regressions, it matters far less for out-of-sample forecasting performance. This happens because, in practice, the bias in $\hat\beta_t$ is small relative to the variation in local return predictability picked up by our local return regressions.

E Details of Simulations from Macro-Finance Models with Time-Varying Risk Premia
In this Appendix, we discuss several details related to the simulation exercises described in Section
5 of the main text, in which we generate sample paths of daily asset returns and state variables
from four workhorse asset pricing models with time-varying risk premia. In each case, we focus on
versions of these models which are solved in continuous time, making it straightforward to discretize
to a daily frequency. Below we provide details associated with the individual models, but we begin
by discussing some features of our analysis that are common across all the models we consider.

E.1 Overview of simulation procedure


In each of the models we consider, standard no-arbitrage conditions hold and, thus, there exists
a unique stochastic discount factor Λt which prices all shocks in each economy. Three of the four
models we consider satisfy assumptions for a representative agent to exist, while the fourth model
has dynamically complete markets. In this latter case, optimal risk sharing conditions pin down
the functional form of the stochastic discount factor.
Characterizing the solution to these models usually proceeds in two steps. First, we need to
characterize the properties of the stochastic discount factor Λt given the primitives of the problem.
For three of the four models we consider, this requires solving a set of partial differential equations.
The functional form for the SDF also allows us to characterize the risk-free rate (except for the rare
disaster model, in which there is a wedge between the short term interest rate and the mean of the
SDF). Second, we price the set of cash flows associated with the stock market, i.e., the aggregate
dividend, Dt . Defining the price-dividend ratio by ξ, the usual no arbitrage argument implies that
it satisfies the equation
\[
\Lambda_t D_t\,dt + E_t[d(\Lambda_t \xi_t D_t)] = 0, \tag{E.11}
\]

which gives us a PDE which characterizes the behavior of the price-dividend ratio, a second state
variable which we consider for our predictive regressions. Then, given the price-dividend ratio, it
is straightforward to compute excess stock returns. Given a time series of realized returns, we can
then compute realized volatility, which gives us a third state variable (rvar) to use in our simulation
exercises. For the first three of the four models under consideration, we use the EconPDE Julia
package developed by Matthieu Gomez, as well as his codes which compute the solutions to each.57
In the final case (Wachter, 2013), we ran replication codes from the original paper kindly shared
with us by Jessica Wachter.
57
We are extremely grateful to Matthieu Gomez for making these codes available, as they greatly facilitated our work on this project.

For each of the four models we consider, we generate 1 million years of daily observations (i.e., 252 million “trading day” observations). Since each model is stationary, we then randomly select a starting point in this history from which to extract a daily time series of the same length as the sample period used for our out-of-sample empirical analyses. We then run the same codes on these simulated data in order to assess the extent to which these models can replicate our evidence.
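
The window-selection step can be sketched as below. The function name and arguments are hypothetical; the sketch only assumes that a long simulated history is available as an array with time along the first axis.

```python
import numpy as np

def draw_sample_path(long_path, sample_len, rng):
    """Extract one contiguous window of length sample_len from a long
    simulated history, starting at a uniformly drawn date."""
    long_path = np.asarray(long_path)
    n_starts = long_path.shape[0] - sample_len + 1
    if n_starts <= 0:
        raise ValueError("sample_len exceeds the length of the simulated history")
    start = int(rng.integers(0, n_starts))
    return long_path[start:start + sample_len]
```

Drawing the start date at random relies on stationarity: every window of the long history is then a draw from the same distribution of sample paths.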
Next, we discuss the basic setup of the four models we consider. As the analysis is somewhat
more transparent for the Campbell and Cochrane (1999) model, we illustrate each of these steps
fairly explicitly. We proceed analogously for the other models, but refer the readers to the relevant
papers for more explicit characterizations of the associated PDEs.

E.2 Campbell and Cochrane (1999)


We begin with the model of Campbell and Cochrane (1999), in which investors' preferences feature “habit formation”: a single state variable captures investors' “habit level” of consumption and generates time variation in effective risk aversion.58 The state variable is the surplus consumption ratio $S_t = (C_t - X_t)/C_t$, with $s_t = \log S_t$, where $X_t$ is a reference point for consumption. The surplus ratio enters the utility of the agent and the SDF has the form $\Lambda_t = e^{-\rho t}(S_t C_t)^{-\gamma}$. Following a sequence of bad shocks, risk aversion and risk premia rise, lowering asset prices.
Specifically, aggregate consumption growth follows a geometric Brownian motion with a constant drift. The log surplus ratio is a mean-reverting process with state-dependent volatility $\sigma_s$:
\[
d\log C = \mu\,dt + \sigma\,dW_t,
\]
\[
ds = \underbrace{-\kappa_S (s - \bar s)}_{\mu_s}\,dt + \underbrace{\lambda(s - \bar s) \cdot \sigma}_{\sigma_s}\,dW_t,
\]

where $\bar s = \log \bar S$ and $\bar S = \sigma \cdot \sqrt{\gamma / (\kappa_S - b/\gamma)}$. A set of restrictions on the state $s$ gives $\lambda(s - \bar s) = \frac{1}{\bar S}\sqrt{1 - 2 \cdot (s - \bar s)} - 1$ for $s \in (-\infty, s_{\max}]$, where $s_{\max} = \bar s + \frac{1}{2}(1 - \bar S^2)$. For simplicity, we assume the stock market represents a claim on aggregate consumption. Thus, general equilibrium is established by $D_t = C_t$. (We follow this approach rather than characterize the price of a levered claim on aggregate consumption, which is also standard.)
We then conjecture the following form of the SDF,
\[
d\log \Lambda_t = \left(-r - \frac{\kappa^2}{2}\right) dt - \kappa\,dW_t,
\]
and an Ito process for the price-dividend ratio $\xi$,
\[
\frac{d\xi}{\xi} = \mu_\xi\,dt + \sigma_\xi\,dW_t.
\]
58
We use the continuous time version of the calibration from Wachter (2005), which also allows habit to affect the
risk-free interest rate. We refer the reader to that paper for further details.

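Plugging processes for $s_t$ and $c_t := \log C_t$ into the SDF and applying Ito's Lemma yields equations for the market price of risk and the interest rate,
\[
\kappa = \gamma(\sigma_s + \sigma), \qquad r = \rho + \gamma(\mu_s + \mu) - \frac{\kappa^2}{2},
\]
or
\[
\kappa = \frac{\gamma\sigma}{\bar S}\sqrt{1 - 2 \cdot (s - \bar s)} = \sqrt{(\gamma\kappa_S - b)(1 - 2 \cdot (s - \bar s))},
\]
\[
r = \rho + \gamma\left(\mu - \frac{\kappa_S}{2}\right) + \frac{b}{2} + b(\bar s - s).
\]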

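Plugging the processes for $C_t$, $\xi_t$, and $\Lambda_t$ into (E.11), we get
\[
0 = \frac{1}{\xi} - r + \mu_\xi + \left(\mu + \frac{\sigma^2}{2}\right) - (\sigma_\xi + \sigma)\kappa + \sigma_\xi \sigma.
\]
Hence, we are looking for a solution of the PDE $\dot\xi = 0$ with
\[
\frac{\dot\xi}{\xi} = \frac{1}{\xi} - r + \mu_\xi + \left(\mu + \frac{\sigma^2}{2}\right) - (\sigma_\xi + \sigma)\kappa + \sigma_\xi \sigma,
\]
where $\mu_\xi$ and $\sigma_\xi$ are identified by Ito's Lemma,
\[
\mu_\xi = \frac{\partial \xi}{\partial s} \cdot \frac{\mu_s}{\xi} + \frac{1}{2} \cdot \frac{\partial^2 \xi}{\partial s^2} \cdot \frac{\sigma_s^2}{\xi}, \qquad
\sigma_\xi = \frac{\partial \xi}{\partial s} \cdot \frac{\sigma_s}{\xi}.
\]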

Our calibration of the associated parameters comes from the solution of the discrete-time Campbell-Cochrane model by Wachter (2005), with the parameters translated to continuous time as follows:

• Consumption growth drift µ = 0.022

• Consumption growth volatility σ = 0.0086

• Relative risk aversion γ = 2

• Rate of time preference parameter ρ = 0.072

• Surplus consumption parameters κs = 0.11 and b = 0.011

These parameters imply S̄ = 0.0376 and s̄ = −3.28. We find the solution numerically on a grid with 798 points.
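
As a check on the calibration, the steady-state surplus ratio and the implied price of risk and risk-free rate can be computed directly from the listed parameters. The sketch below uses the expressions for S̄, λ(s − s̄), κ, and r as reconstructed in this section (with the restriction S̄ = σ·sqrt(γ/(κS − b/γ))); the function names are hypothetical and it is not the numerical PDE solution.

```python
import math

# Calibrated (annualized) parameters listed above
mu, sigma = 0.022, 0.0086      # consumption growth drift and volatility
gamma, rho = 2.0, 0.072        # risk aversion and rate of time preference
kappa_S, b = 0.11, 0.011       # surplus consumption parameters

S_bar = sigma * math.sqrt(gamma / (kappa_S - b / gamma))
s_bar = math.log(S_bar)
s_max = s_bar + 0.5 * (1.0 - S_bar**2)

def sensitivity(s):
    """lambda(s - s_bar), defined for s <= s_max."""
    return math.sqrt(1.0 - 2.0 * (s - s_bar)) / S_bar - 1.0

def price_of_risk(s):
    """kappa = gamma * (sigma_s + sigma), with sigma_s = lambda(s - s_bar) * sigma."""
    return gamma * sigma * (1.0 + sensitivity(s))

def risk_free_rate(s):
    """r = rho + gamma * (mu_s + mu) - kappa^2 / 2, with mu_s = -kappa_S (s - s_bar)."""
    mu_s = -kappa_S * (s - s_bar)
    return rho + gamma * (mu_s + mu) - 0.5 * price_of_risk(s) ** 2

print(round(S_bar, 4), round(s_bar, 2))
```

Evaluating `risk_free_rate` at different values of `s` reproduces the linear form r = ρ + γ(µ − κS/2) + b/2 + b(s̄ − s), which is a convenient consistency check on the two expressions for κ.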

E.3 Bansal and Yaron (2004)
Next, we consider a continuous time version of the long run risk model of Bansal and Yaron (2004), which is well described in Chen et al. (2009). In the model, investors have recursive preferences, $U_t = E_t\left[\int_t^\infty f(c_u, U_u)\,du\right]$, with aggregator
\[
f(c, U) = \frac{1}{1 - \psi^{-1}} \left[ \frac{\rho c^{1-\psi^{-1}}}{[(1-\gamma)U]^{(\gamma - \psi^{-1})/(1-\gamma)}} - \rho(1-\gamma)U \right],
\]

properties of which are analyzed in Duffie and Epstein (1992). The model is characterized by
processes for the consumption growth drift and stochastic volatility:

\[
d\mu = k_\mu(\bar\mu - \mu)\,dt + \nu_\mu v\,dW_t^1,
\]
\[
dv = k_v(1 - v)\,dt + \nu_v v\,dW_t^2,
\]
\[
dC/C = \mu\,dt + v\,(\nu_{c,1}\,dW_t^1 + \nu_{c,3}\,dW_t^3).
\]

Hence, the model is described by two state variables, µ and v. Following Hansen, Khorrami and
Tourre (2018), who solve a version of the same model in continuous time as part of their mfrSuite
package, we allow shocks to consumption growth and µ to be contemporaneously correlated. For
the stock return, we price a levered claim on aggregate consumption that pays $D_t = C_t^\phi$.
Define $x := (\mu, v)$. We conjecture Ito processes for the SDF
\[
\frac{d\Lambda_t}{\Lambda_t} = -r\,dt - \kappa_\mu\,dW_t^1 - \kappa_v\,dW_t^2 - \kappa_c\,dW_t^3,
\]
and the price-dividend ratio
\[
\frac{d\xi(x)}{\xi(x)} = \mu_\xi\,dt + \sigma_{\xi,\mu}\,dW_t^1 + \sigma_{\xi,v}\,dW_t^2.
\]

Then, we can use the properties of the utility function to characterize a PDE in x which yields the prices of risk. We omit these details for brevity. Then, we price the aggregate dividend by applying the no-arbitrage condition (E.11).
The state space is two-dimensional, so we solve the model on a two-dimensional grid. The grid Gx is a product of grids for µ and v, Gx = Gµ ⊗ Gv. We choose 90 points for each of Gµ and Gv, giving us 8100 grid points in total. We adapt the Julia codes from the EconPDE package to use the calibrated parameters from Hansen, Khorrami and Tourre (2018).
Specifically, we assume the following annualized values for each of the parameters above:

• Consumption growth µ̄ = 0.0015 × 12

• Average shock variance v̄ = 1

• Annual persistence coefficients kµ = 0.021 × 12 and kv = 0.013 × 12

• Average shock volatilities νµ = 0.000344384 × √12 × 12, νv = 0.038 × √12, νc,1 = 0.000011615 × √12, and νc,3 = −0.00778202 × √12

• Rate of time preference parameter ρ = 0.024

• Relative risk aversion γ = 7.5 and elasticity of intertemporal substitution ψ = 1.5
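
To illustrate how the state variables can be discretized to a daily frequency, the sketch below applies a simple Euler scheme to the processes for µ, v, and consumption growth, taking the diffusion terms and parameter values as they are written above. The Euler scheme, the function name, and the starting values are assumptions made for illustration; the sketch does not solve for asset prices.

```python
import numpy as np

def simulate_lrr_states(n_days, seed=0, dt=1.0 / 252):
    """Euler discretization of (mu, v) and of dC/C at a daily step."""
    rng = np.random.default_rng(seed)
    # Annualized parameter values as listed above
    mu_bar = 0.0015 * 12
    k_mu, k_v = 0.021 * 12, 0.013 * 12
    nu_mu = 0.000344384 * np.sqrt(12) * 12
    nu_v = 0.038 * np.sqrt(12)
    nu_c1 = 0.000011615 * np.sqrt(12)
    nu_c3 = -0.00778202 * np.sqrt(12)

    mu = np.empty(n_days + 1)
    v = np.empty(n_days + 1)
    dc = np.empty(n_days)            # daily consumption growth dC/C
    mu[0], v[0] = mu_bar, 1.0        # start at the unconditional means
    for t in range(n_days):
        dW = rng.normal(scale=np.sqrt(dt), size=3)   # independent shocks W1, W2, W3
        dc[t] = mu[t] * dt + v[t] * (nu_c1 * dW[0] + nu_c3 * dW[2])
        mu[t + 1] = mu[t] + k_mu * (mu_bar - mu[t]) * dt + nu_mu * v[t] * dW[0]
        v[t + 1] = v[t] + k_v * (1.0 - v[t]) * dt + nu_v * v[t] * dW[1]
    return mu, v, dc
```

The first shock enters both the drift process and consumption growth, which is how the contemporaneous correlation between the two is introduced in this setup.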

E.4 Gârleanu and Panageas (2015)


Gârleanu and Panageas (2015) consider an overlapping generations model with two types of agents, i = A, B, who differ in their preferences. Type-A agents make up a smaller fraction of the population and are more risk tolerant.59 Aggregate consumption follows a geometric Brownian motion with drift. Here, we omit most details of the model because we follow the original paper, which is already in continuous time, as closely as possible. Agents can write contracts to share risks associated with fluctuations in the aggregate endowment and have access to a set of annuity contracts which insure against longevity risk.
For purposes of solving the model, the key is that the model has only one state variable, Xt, which is the consumption share of type-A agents, i.e., the share of total consumption accounted for by all generations of type-A agents. Intuitively, the effective level of risk aversion in the economy is lower when Xt is
high, which generates time variation in the price of risk on shocks to the aggregate endowment. Xt
follows an Ito process which is driven by one shock only (the shock to the aggregate endowment)

dXt = µX (Xt )dt + σ X (Xt )dWt .

As in Bansal and Yaron (2004), each type of agent has recursive Duffie and Epstein (1992) preferences. Lifetime utility of every agent of type $i$ with wealth $W$ can be represented as
\[
U^i(W, x) = \frac{W^{1-\gamma^i}}{1 - \gamma^i} \cdot g^i(X_t)^{-\frac{(\psi^i)^{-1}(1-\gamma^i)}{1 - (\psi^i)^{-1}}},
\]

where $g^i(X_t)$ is the consumption-to-wealth ratio.60 For more robust convergence of the finite-difference method, we solve the model in terms of the inverse of the consumption-to-wealth ratio, $\varsigma^i(x) = (g^i(x))^{-1}$, for each type of agent. To define all state-dependent parameters we need to solve for four functions, namely the wealth-to-consumption ratios for both agents, $\{\varsigma^i(x)\}_{i=A,B}$, and functions $\{\phi^j(x)\}_{j=1,2}$, which can be interpreted as capturing the price of a claim on a pre-specified cash flow.61 Following the solution approach in the paper, we can get a set of PDEs which capture each one of these functions, prices of risk, and the value of a claim on aggregate capital income. For simplicity, we assume that the stock market in the model corresponds to the value of an unlevered claim on capital income.

59
For instance, we might interpret such agents as entrepreneurs in other models.
60
E.g., $g^A = C_t^A / W_t^A = (\varsigma^A)^{-1}$.
61
Specifically, each captures the price at time $t$ of a claim on $B_j \times e^{-(\pi+\delta_j)(s-t)}\, Y_s / Y_t$ for every $s \geq t$, where $Y_s$ is aggregate consumption/production, and other parameters come from the paper.

E.5 Wachter (2013)


Wachter (2013) considers a representative agent economy in which aggregate consumption and
the dividend on the aggregate stock market are exposed to the risk of rare disasters, i.e., large
downward jumps in the aggregate endowment. As in the Bansal-Yaron model above, investors
have Epstein-Zin preferences. (As earlier, we omit expressions for the SDF and PDEs for valuation
ratios in this subsection for brevity.)
Aggregate consumption evolves according to
\[
dC_t = \mu C_{t^-}\,dt + \sigma C_{t^-}\,dB_t + (e^{Z_t} - 1)\, C_{t^-}\,dN_t, \tag{E.12}
\]
where $B_t$ is a standard Brownian motion and $N_t$ is a Poisson process with a time-varying intensity $\lambda_t$, which evolves according to
\[
d\lambda_t = \kappa(\bar\lambda - \lambda_t)\,dt + \sigma_\lambda \sqrt{\lambda_t}\,dB_{\lambda,t}. \tag{E.13}
\]
$B_{\lambda,t}$ is also a Brownian motion, and $B_t$, $B_{\lambda,t}$, and $N_t$ are mutually independent. Dividends are modeled as a levered claim on consumption, i.e., $D_t = C_t^\phi$, where $\phi = 2.6$. In addition, the model allows for partial default on government debt if a disaster occurs.
Our parameter values and solution approach are identical to those in Wachter (2013); accord-
ingly, we refer the reader to that paper for further technical details.62 Consistent with conventions
in the rare disaster literature, we consider two different sets of simulation exercises: one in which
we include sample paths with disasters and another in which we focus exclusively on sample paths
for which no disaster occurs.
62
We are extremely grateful to Jessica Wachter for kindly providing the replication codes, from which we simulated
data from the model.

F Moments of Classical Asset Pricing Models
This appendix discusses the challenges that arise when estimating predictive regressions at short horizons on data generated by classical asset pricing models, especially with fairly short data samples as is the case for our kernel regressions. Table A.14 helps illustrate these challenges by reporting a number of moments from the asset pricing models covered in our analysis.
First, there is potential for model misspecification in simple univariate return forecasts, which
occurs because the observed state variable(s) may not map one-to-one into the equity risk premium
and also because this mapping may not be linear. For instance, the price-dividend ratio encodes
information about future expected cash flow growth, risk premia, and real interest rates in addi-
tion to the current equity risk premium, introducing an errors-in-variables problem in univariate
predictive regressions of excess returns on the dividend-price ratio. As an example of this, the
primary driver of the risk premium in the Bansal-Yaron model is a variable capturing stochastic
volatility (ν), which only has a correlation of 14% and -3% with the dp ratio and risk-free rates,
respectively, and a higher though still imperfect 75% correlation with rvar. (In contrast, correla-
tions are quite high with the expected growth rate µ in this calibration.) The Campbell-Cochrane,
Garleanu-Panageas, and Wachter models all feature a single state variable, so only nonlinearities
can bring correlations below unity in absolute value. In any case, the dp ratio tracks the relevant
state variable with correlations often exceeding 90% (Table A.14, Panel A).
Second, the signal-to-noise ratio is extremely low, especially at a daily frequency. To illustrate
this, Panel B in Table A.14 reports the daily R2 associated with univariate regressions of each
of the simulated predictor variables, as well as the true risk premium, for each of the models in
question at various horizons. While there is a modest amount of predictability over the span of
multiple years, signal-to-noise ratios tend to be extremely low at short horizons. Therefore, given
that regressors are quite persistent at a daily frequency, one would generally expect to see very
poor finite sample performance of regressions estimated with only a few years of daily data.
Finally, because many of the models considered here replicate the third property mentioned
above (discount rates vary mostly due to changes in risk premia rather than risk free rates), rises
in risk premia captured by the state variables in each model are associated with sharply negative
realized returns. As a result, to compound the challenges of a low signal-to-noise ratio, the Stam-
baugh (1999) bias is a very serious concern for regressions of returns on the price-dividend ratio,
especially in very short samples. Since many of these models only involve a single state variable,
other variables such as the risk-free rate are often subject to non-trivial biases coming from a cor-
relation between shocks to the predictor variables and realized returns (even if data analogs to the
correlations relevant for assessing the magnitude of the Stambaugh bias suggest that the problem
should be less pronounced for these variables). Predictors are usually quite persistent and there are often quite strong negative correlations between realized returns and our state variables of interest (Table A.14, Panels C and D). These factors combine to suggest that in-sample predictive regression coefficients would likely be associated with substantial attenuation bias.

G Impulse Responses
Figure A.5 simulates impulse responses resulting from a very large shock to $z_{dr,t}$ and $z_{cf,t}$, respectively. Specifically, we consider the response to a shock with a size equal to $\sqrt{252/4} = \sqrt{63}$ times the standard deviation of a daily shock to each variable, which roughly corresponds to the amount of variation the model would generate in a single quarter. We consider two configurations of the model: our baseline calibrated model with sticky expectations (we discuss our calibration approach below) and a rational expectations model with all of the same parameters except that the stickiness parameter λ is set to zero. Responses to shocks to subjective risk premia (left panel) and orthogonal shocks to the risk free rate $\epsilon_{tp,t}$ (not pictured) are identical across the two models. In both cases, large upward revisions in subjective discount rates trigger large negative return realizations which are gradually offset by modest increases in expected returns over the medium term. Such a pattern creates substantial scope for Stambaugh (1999) bias.
In contrast, the two models differ substantially in terms of how expected returns and state variables respond to a large shock to expected cash flows $\epsilon_{cf,t}$ (right panel). In the rational expectations model, such a shock generates a one-time, large realized return and a very modest change in
the risk-free rate. However, in the sticky expectations model, responses of both the dividend-price
ratio and risk-free rate are hump-shaped, where the gap between the rational and sticky model
impulse response functions closes essentially to zero within about half a year. These sluggish ad-
justments yield a modest amount of predictability in expected returns which decays towards zero
fairly quickly, contrasting sharply with the spike obtained in the rational expectations model.
It is also easy to see why performance of predictive regressions can be unstable in this environment. Both $dp_t$ and $r_{f,t+1}$ load linearly, albeit with different weights, on $z_{dr,t}$, agents' subjective beliefs of cash flow growth $F_t[\Delta d_{t+1}] = z_{cf,t} - \vartheta_t$, as well as other factors. On average, $F_t[\Delta d_{t+1}]$ is positively correlated with $\vartheta_t$, since both are moving averages of $\{\epsilon_{cf,t-j}\}_{j=0}^{\infty}$ with strictly positive weights. Depending on the sequence of shocks experienced, recent level changes in each state variable may reflect different combinations of these factors at different times.

H Calibration of Parameters for Sticky Expectations Model
To assess the potential quantitative importance of a sticky beliefs mechanism in explaining our
empirical results, we calibrate the parameters of the simple model outlined in equations (17-23).
We first fix parameters related to the annualized means of dividend growth, the risk free rate, and
expected returns at 5.3%, 1.5%, and 7%, respectively, to match the sample average of pdt , E[rf,t+1 ],
and E[rt+1 ]. We set the linearization point at E[pdt ] when selecting values of κ and ρ.
Given the central importance of the sticky expectations channel, we discipline parameters gov-
erning the degree of stickiness via external estimates from the literature. In our numerical experi-
ments below, we fix a value of λ ex-ante using empirical results from Coibion and Gorodnichenko
(2015). These authors argue that, in two classes of models of information rigidity, the degree of
information rigidity can be consistently estimated using microdata from professional forecasters.
We take the estimated degree of information rigidity computed using quarterly forecasts of real
output (from Table 6, Column 3 in the paper), choosing a daily information rigidity parameter
which implies a similar degree of mean reversion at a quarterly frequency. Specifically, we set
$\lambda = 0.3^{4/252} \approx 0.981$, which is also similar to the implied degree of rigidity found by Bouchaud
et al. (2019). The associated degree of information rigidity is fairly mild; in particular, if objective
expectations were to increase today by 10%, subjective expectations would already have increased
by about 3.6% within a month and 7% within a quarter.
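
The conversion from a quarterly to a daily rigidity parameter can be checked with a few lines of code. The sketch assumes that subjective beliefs update as F_t = λ F_{t−1} + (1 − λ) E_t, so that a fraction 1 − λ^n of a permanent shift in objective expectations is incorporated after n days; this updating rule and the function name are assumptions made here for illustration.

```python
lam_quarterly = 0.3                        # quarterly information rigidity
lam_daily = lam_quarterly ** (4 / 252)     # same mean reversion over 63 trading days

def share_adjusted(n_days, lam=lam_daily):
    """Fraction of a permanent shift in objective expectations that is
    reflected in subjective expectations after n_days."""
    return 1.0 - lam ** n_days

print(round(lam_daily, 3))                 # daily rigidity parameter
print(share_adjusted(63))                  # adjustment within one quarter
```

Under this rule the adjustment within a quarter is 70% by construction; the share after one month depends on the day count used for a month, which is not stated here.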
Likewise, the time series properties of ϑt depend crucially on the extent of objective cash flow
predictability in the model, which is governed by the parameters $\rho_{cf}$ and $\mathrm{Std}[\epsilon_{cf,t}]$. Two related
papers, Schorfheide, Song and Yaron (2018) and Pettenuzzo, Sabbatucci and Timmermann (2020),
both leverage fairly high-frequency data in order to estimate the parameters of a latent, persistent
component in expected dividend growth. Specifically, Schorfheide, Song and Yaron (2018) use cash
flows measured at annual, quarterly, and monthly frequencies, whereas Pettenuzzo, Sabbatucci and
Timmermann (2020) exploit daily data on cash flows for all companies in the US stock market.
Both papers uncover moderately persistent estimates for the AR(1) coefficient of cash flow dynamics
(ρcf ); Pettenuzzo, Sabbatucci and Timmermann (2020) obtain annualized estimates of ρcf ranging
between 0.6 and 0.77, while the posterior median estimate of Schorfheide, Song and Yaron (2018)
(Table VI) is 0.67. That said, their estimates of shock volatilities imply quite different unconditional
volatilities of the persistent component of expected cash flow growth.63 Since our objective function
is fairly flat in the parameter ρcf , we elect to fix the persistence at the posterior median estimate of
Schorfheide, Song and Yaron (2018) but allow other cash flow volatility parameters to be internally
calibrated to match additional volatility and covariance targets.
63
Whereas the Schorfheide, Song and Yaron (2018) calibration implies that the cash flow growth component
has a volatility of around 11% (in annualized units), the Pettenuzzo, Sabbatucci and Timmermann (2020) data
generating process has a smaller unconditional volatility. This difference largely reflects that Pettenuzzo, Sabbatucci
and Timmermann (2020) explicitly filter out jump components in cash flows which naturally reduces the volatility
estimates.

All remaining parameters are set to match a sequence of asset pricing moments calculated over the sample for which we have computed our out-of-sample results. First, we target unconditional volatilities of daily log excess returns, the annualized one-period risk-free rate, as well as the log dividend-price ratio. Next, we also seek to match the monthly autocorrelations of the latter two of these variables, both of which are quite persistent, as well as the correlation between them.64 Finally, we seek to match the full sample OLS coefficients from regressions of log excess returns on $pd_t$ and $r_{f,t+1}$, respectively, as well as the correlations between AR(1) innovations in each of
these variables and forecast errors from these predictive regressions. These additional moments are
intended to ensure that the model generates potential Stambaugh biases which are consistent with
the data so we refer to them as “Stambaugh correlations”. Table A.17 summarizes the calibrated
parameters as well as a comparison of data vs model-implied moments obtained from these exercises.
In general, our calibrated model matches these targets fairly well.
Before discussing our simulations, we pause to discuss what is not targeted in these calibrations.
We deliberately fix the degree of information rigidity based on estimates from the literature, and the
asset pricing moments selected are fairly standard and, as such, not explicitly tied to any evidence
related to pockets of predictability. Therefore, we view our examination of the model’s ability (or
lack thereof) to match evidence related to pockets as a nontargeted validation test of the model.
As additional points of comparison, our simulations below will also consider two alternative models. The first is a rational expectations version of our model which has the same true cash flow dynamics but no information rigidities (λ = 0). The second is an additional rational expectations model whose parameters are recalibrated under λ = 0. Since the effects of sticky expectations on unconditional asset pricing moments are fairly modest, these recalibrated parameters are similar to those from our baseline model.

64
Rather than directly target the AR(1) coefficients, we compute the absolute value of the difference between the
data and the model-implied half-life of a shock in our objective function that compares data and model-implied
moments.

Panel A: Out-of-pocket
                                       Daily                         Monthly
Statistics                       dp    tbl    tsp   rvar       dp    tbl    tsp   rvar
Mean                           0.03   0.03   0.03   0.03     0.79   0.76   0.78   0.72
Standard deviation             1.08   1.09   1.06   1.12     5.35   5.24   5.14   5.38
First-order autocorrelation    0.07   0.05   0.05   0.06     0.12   0.11   0.11   0.13
Skewness                       0.03  -0.03  -0.02   0.02     0.58   0.62   0.58   0.64
Kurtosis                      19.39  19.24  19.58  18.42    11.78  12.21  12.16  11.83

Panel B: In-pocket
                                       Daily                         Monthly
Statistics                       dp    tbl    tsp   rvar       dp    tbl    tsp   rvar
Mean                           0.03   0.03   0.04   0.04     0.31   0.41  -0.24   0.69
Standard deviation             0.89   0.83   0.94   0.78     4.27   4.86   5.66   4.30
First-order autocorrelation    0.04   0.22   0.22   0.12    -0.02   0.15   0.14   0.07
Skewness                      -0.49   0.09   0.06  -0.39    -0.45  -0.37  -0.17  -0.68
Kurtosis                       9.25   5.42   4.66   9.23     3.45   4.44   4.08   3.92

Table A.1: Pocket return statistics. This table reports the mean, standard deviation, first-order autocorrelation, skewness, and kurtosis of excess returns (measured in percentage points) in- vs. out-of-pocket. Coefficients are estimated using a one-sided kernel with a 2.5-year effective sample size, and pockets are determined as periods where a fitted squared forecast error differential (relative to a prevailing mean forecast and estimated using a one-sided kernel with a 1-year effective sample size) is above 0 in the preceding period.
Unrestricted + excess return forecasts All sign restrictions
i.i.d Block EGARCH i.i.d. Block EGARCH i.i.d. Block EGARCH
Statistics Actual Avg. p-val Avg. p-val Avg. p-val Actual Avg. p-val Avg. p-val Avg. p-val Actual Avg. p-val Avg. p-val Avg. p-val
dp
CWFS −0.74 −0.11 0.73 −0.19 0.72 −0.27 0.67 0.40 −0.08 0.33 0.07 0.37 0.02 0.36 0.68 −0.08 0.23 −0.09 0.21 0.04 0.26
CWIP 3.00 −0.11 0.00 0.12 0.00 −0.33 0.00 3.79 −0.06 0.00 0.40 0.00 −0.01 0.00 4.03 −0.05 0.00 0.33 0.00 −0.01 0.00
CWOOP −1.62 −0.07 0.94 −0.38 0.90 −0.11 0.93 −1.94 −0.08 0.97 −0.30 0.95 0.02 0.98 −1.84 −0.08 0.96 −0.37 0.93 0.05 0.98
α̂ 1.69 −0.20 0.06 0.41 0.10 0.17 0.03 2.51 −0.21 0.01 0.52 0.05 0.57 0.08 2.95 −0.29 0.00 0.44 0.03 0.46 0.04
tα̂ 2.10 −0.19 0.01 0.37 0.04 0.28 0.03 2.89 −0.18 0.00 0.40 0.00 0.42 0.01 3.26 −0.24 0.00 0.32 0.00 0.33 0.00
SR 0.47 0.46 0.45 0.46 0.47 0.54 0.66 0.54 0.45 0.27 0.46 0.29 0.56 0.56 0.57 0.45 0.19 0.45 0.23 0.54 0.42
tbl
CWFS 0.68 −0.08 0.23 0.06 0.28 −0.26 0.18 1.98 −0.03 0.02 0.21 0.03 0.01 0.03 2.03 0.03 0.02 0.29 0.03 0.21 0.03
CWIP 3.28 −0.06 0.00 0.22 0.00 −0.25 0.00 4.75 −0.06 0.00 0.32 0.00 −0.06 0.00 4.69 −0.06 0.00 0.34 0.00 0.08 0.00
CWOOP −1.58 −0.08 0.93 −0.18 0.91 −0.18 0.89 −1.33 0.00 0.90 −0.07 0.90 0.05 0.92 −1.21 0.07 0.90 0.04 0.90 0.21 0.92
α̂ 3.57 −0.20 0.00 0.26 0.00 0.67 0.00 6.48 −0.24 0.00 0.31 0.00 1.21 0.00 6.07 −0.29 0.00 0.24 0.00 1.31 0.00
tα̂ 4.35 −0.22 0.00 0.22 0.00 0.97 0.00 5.56 −0.23 0.00 0.24 0.00 1.00 0.00 5.37 −0.28 0.00 0.18 0.00 1.03 0.00
SR 0.79 0.50 0.02 0.50 0.03 0.57 0.08 0.94 0.50 0.00 0.50 0.00 0.57 0.01 0.92 0.50 0.00 0.50 0.00 0.56 0.01
tsp
CWFS 0.15 −0.10 0.41 0.08 0.47 −0.27 0.35 0.95 −0.04 0.17 0.19 0.21 0.15 0.24 0.79 0.04 0.22 0.15 0.27 0.24 0.31
CWIP 3.04 −0.11 0.00 0.19 0.00 −0.28 0.00 4.52 −0.08 0.00 0.27 0.00 0.10 0.00 4.18 −0.03 0.00 0.31 0.00 0.05 0.00
CWOOP −1.52 −0.06 0.93 −0.12 0.91 −0.14 0.91 −1.54 −0.00 0.94 −0.04 0.93 0.10 0.93 −0.21 0.05 0.60 −0.03 0.58 0.24 0.65
α̂ 3.14 −0.26 0.01 0.20 0.01 0.48 0.00 5.70 −0.35 0.00 0.17 0.00 0.96 0.00 5.04 −0.24 0.00 0.27 0.00 0.76 0.00
tα̂ 4.26 −0.26 0.00 0.15 0.00 0.68 0.00 4.95 −0.31 0.00 0.11 0.00 0.79 0.00 4.45 −0.22 0.00 0.20 0.00 0.64 0.00
SR 0.77 0.41 0.00 0.41 0.01 0.46 0.02 0.85 0.41 0.00 0.41 0.00 0.45 0.01 0.84 0.40 0.00 0.42 0.00 0.45 0.01
rvar
CWFS −1.49 −0.15 0.90 −0.28 0.91 −0.39 0.88 −0.79 −0.12 0.74 0.05 0.79 −0.16 0.71 −0.44 −0.09 0.64 0.10 0.69 0.12 0.70
CWIP 2.88 −0.14 0.00 −0.06 0.00 −0.41 0.00 3.93 −0.14 0.00 0.24 0.00 −0.17 0.00 3.10 −0.08 0.00 0.28 0.00 −0.11 0.00
CWOOP −1.77 −0.10 0.95 −0.34 0.94 −0.20 0.94 −1.07 −0.05 0.84 −0.21 0.81 −0.08 0.84 −0.97 −0.08 0.81 −0.12 0.81 0.20 0.89
α̂ 2.31 −0.23 0.02 0.65 0.05 0.36 0.01 2.89 −0.26 0.01 0.81 0.05 0.78 0.05 2.69 −0.13 0.02 0.87 0.06 0.57 0.04
tα̂ 3.63 −0.23 0.00 0.64 0.00 0.53 0.00 3.47 −0.24 0.00 0.64 0.00 0.70 0.00 3.03 −0.13 0.00 0.69 0.01 0.51 0.01
SR 0.68 0.45 0.05 0.46 0.07 0.56 0.23 0.71 0.45 0.03 0.46 0.05 0.55 0.15 0.54 0.45 0.25 0.47 0.31 0.55 0.52

Table A.2: OOS statistical model simulations (zero predictability null). This table reports Monte Carlo simulation results for the
1-sided Kernel empirical findings. We consider 3 ways of bootstrapping the fitted residuals from a zero coefficient predictive regression model for excess
returns and an AR(1) model for the predictor: (i) an i.i.d. heteroskedastic bootstrap, (ii) a stationary block bootstrap where the optimal block length
is chosen according to Politis and White (2004), (iii) an EGARCH(1,1) with t-distributed shocks. All residuals are resampled jointly to preserve the
cross-sectional correlation between the innovations to the predictor and excess returns. We generate 1,000 bootstrap samples of the same sample size as is
available for each predictor in the data. A pocket is classified as a period where a fitted squared forecast error differential (estimated using a 1-sided Kernel
with a 1-year effective sample size) is above 0 in the preceding period. We report 6 statistics. The first 3 are Clark and West (2007) t-statistics relative to
a prevailing mean benchmark in the full sample, in-pocket, and out-of-pocket. The second 3 are economic statistics associated with returns on a portfolio
which utilizes the time-varying coefficient model forecast in-pocket and the prevailing mean forecast out-of-pocket to allocate between the risk-free asset
and the market (portfolio weights are limited to be between 0 and 2): the annualized estimated alpha in percentage points, the HAC t-statistic associated
with that estimated alpha, and the annualized Sharpe Ratio of the portfolio. Column 2 presents the corresponding statistics from the data for reference.
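
The Clark and West (2007) statistics reported in this table can be sketched as follows. This is a minimal version that assumes a plain (non-HAC) standard error for the adjusted loss differential; the function name and the `mask` argument, used to restrict the sample to in-pocket or out-of-pocket periods, are hypothetical.

```python
import numpy as np

def clark_west_t(returns, model_fc, bench_fc, mask=None):
    """Clark-West t-statistic of a model forecast against a benchmark forecast."""
    r, m, b = (np.asarray(a, dtype=float) for a in (returns, model_fc, bench_fc))
    # Benchmark squared error minus the model squared error, where the latter
    # is adjusted for the noise introduced by estimating the larger model.
    f = (r - b) ** 2 - ((r - m) ** 2 - (b - m) ** 2)
    if mask is not None:
        f = f[np.asarray(mask, dtype=bool)]
    return f.mean() / (f.std(ddof=1) / np.sqrt(len(f)))
```

Passing the pocket indicator as `mask` gives the in-pocket statistic, and passing its negation gives the out-of-pocket statistic. Positive values favor the model forecast over the benchmark.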
Pocket num  dp  tbl  tsp  rvar
1  0.05 (0.42)  3.93∗∗∗ (0.00)  0.28 (0.13)  0.61∗ (0.07)
2  0.49 (0.12)  0.86∗∗ (0.05)  0.69∗∗ (0.05)  1.90∗∗∗ (0.01)
3  0.37 (0.15)  9.29∗∗∗ (0.00)  7.54∗∗∗ (0.00)  0.49∗ (0.09)
4  4.76∗∗∗ (0.00)  0.54∗ (0.08)  5.87∗∗∗ (0.00)  2.54∗∗∗ (0.01)
5  3.69∗∗∗ (0.00)  8.73∗∗∗ (0.00)  1.77∗∗∗ (0.01)  1.22∗∗ (0.03)
6  0.06 (0.38)  1.93∗∗∗ (0.01)  1.68∗∗∗ (0.01)  16.42∗∗∗ (0.00)
7  0.29 (0.17)  11.69∗∗∗ (0.00)  2.62∗∗∗ (0.00)  1.88∗∗∗ (0.01)
8  -0.24 (1.00)  -0.24 (1.00)  4.70∗∗∗ (0.00)
9  1.92∗∗∗ (0.01)  3.47∗∗∗ (0.00)  1.48∗∗ (0.02)
10  0.33 (0.16)  2.42∗∗∗ (0.01)  -0.87 (1.00)
11  0.13 (0.27)  1.05∗∗ (0.03)  5.22∗∗∗ (0.00)
12  3.90∗∗∗ (0.00)  0.72∗ (0.06)  3.45∗∗∗ (0.00)
13  0.90∗ (0.06)  0.94∗∗ (0.04)
14  2.92∗∗∗ (0.00)  0.49∗ (0.09)
15  1.68∗∗ (0.02)  1.93∗∗∗ (0.01)
16  0.22 (0.21)  1.92∗∗∗ (0.01)
17  4.56∗∗∗ (0.00)
18  1.11∗∗ (0.05)

Table A.3: Individual pocket p-values. This table reports p-values for individual pockets estimated using the
unrestricted forecasts from the time-varying coefficient model. p-values are computed as the fraction of pockets from
the EGARCH(1,1) model with t-distributed shocks that have integral R² values greater than the integral R²
from the individual pocket in the data.
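
The p-value calculation described in this caption amounts to a one-sided comparison against the simulated distribution, and can be sketched as follows. The function and argument names are hypothetical, and the simulated integral R² values are taken as given rather than generated here.

```python
import numpy as np

def pocket_p_value(actual_integral_r2, simulated_integral_r2):
    """Share of simulated pockets whose integral R^2 exceeds the observed one."""
    sims = np.asarray(simulated_integral_r2, dtype=float)
    return float(np.mean(sims > actual_integral_r2))
```

Under this rule a pocket with a negative integral R² can receive a p-value of 1.00, which happens whenever every simulated pocket has a larger value.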

i.i.d. Block EGARCH
Stats Actual Avg. Std. err. p-val Avg. Std. err. p-val Avg. Std. err. p-val
Correlation -0.5
CWFS −0.74 −0.03 1.04 0.77 0.13 0.99 0.82 −0.32 0.96 0.67
CWIP 3.00 −0.06 1.02 0.00 0.36 0.96 0.00 −0.35 0.95 0.00
CWOOP −1.62 −0.01 1.04 0.94 −0.20 0.99 0.92 −0.15 1.00 0.93
α̂ 1.69 −0.15 1.19 0.06 0.84 1.10 0.22 0.47 0.90 0.08
tα̂ 2.10 −0.15 1.03 0.01 0.72 0.96 0.07 0.65 1.10 0.10
SR 0.47 0.40 0.12 0.30 0.45 0.14 0.43 0.47 0.15 0.49
Correlation -0.8
CWFS −0.74 −0.10 1.05 0.72 −0.25 0.95 0.69 −0.32 0.96 0.65
CWIP 3.00 −0.12 1.01 0.00 0.07 0.96 0.00 −0.37 0.90 0.00
CWOOP −1.62 −0.06 1.06 0.93 −0.41 0.97 0.90 −0.13 1.03 0.92
α̂ 1.69 −0.19 1.18 0.06 0.35 1.05 0.12 0.11 0.81 0.03
tα̂ 2.10 −0.18 1.03 0.01 0.31 0.96 0.03 0.20 0.99 0.03
SR 0.47 0.40 0.10 0.27 0.44 0.11 0.39 0.45 0.12 0.45
Correlation -0.9
CWFS −0.74 −0.12 1.03 0.74 −0.45 0.96 0.60 −0.35 0.98 0.66
CWIP 3.00 −0.15 0.97 0.00 −0.17 0.90 0.00 −0.34 0.92 0.00
CWOOP −1.62 −0.06 1.04 0.93 −0.45 0.99 0.89 −0.20 1.06 0.91
α̂ 1.69 −0.21 1.13 0.04 −0.01 1.01 0.04 −0.06 0.84 0.02
tα̂ 2.10 −0.19 0.97 0.01 −0.01 0.93 0.01 0.01 1.01 0.02
SR 0.47 0.40 0.09 0.24 0.43 0.10 0.35 0.45 0.11 0.45
Correlation -0.95
CWFS −0.74 −0.14 1.00 0.71 −0.55 0.94 0.57 −0.35 0.98 0.65
CWIP 3.00 −0.15 0.99 0.00 −0.29 0.95 0.00 −0.35 0.91 0.00
CWOOP −1.62 −0.10 1.02 0.93 −0.48 0.98 0.89 −0.19 1.03 0.92
α̂ 1.69 −0.21 1.16 0.04 −0.23 1.03 0.04 −0.16 0.86 0.02
tα̂ 2.10 −0.19 1.01 0.01 −0.22 0.95 0.01 −0.14 1.01 0.01
SR 0.47 0.40 0.09 0.22 0.43 0.09 0.34 0.45 0.11 0.44
Correlation -0.99
CWFS −0.74 −0.18 1.00 0.71 −0.63 0.96 0.54 −0.35 0.99 0.64
CWIP 3.00 −0.18 1.00 0.00 −0.37 0.94 0.00 −0.33 0.90 0.00
CWOOP −1.62 −0.11 1.01 0.93 −0.53 1.00 0.86 −0.20 1.00 0.92
α̂ 1.69 −0.23 1.16 0.05 −0.37 1.00 0.02 −0.25 0.85 0.01
tα̂ 2.10 −0.20 1.00 0.01 −0.36 0.93 0.01 −0.24 0.98 0.01
SR 0.47 0.40 0.08 0.21 0.44 0.09 0.33 0.45 0.10 0.44

Table A.4: Correlation robustness of dividend-price ratio results. This table reports Monte Carlo simulation
results for the 1-sided Kernel empirical findings for the dp model. We consider 3 ways of bootstrapping the
fitted residuals from a constant coefficient predictive regression model and an AR(1) model for the predictor: (i) an
i.i.d. heteroskedastic bootstrap, (ii) a stationary block bootstrap where the optimal block length is chosen according
to Politis and White (2004), (iii) an EGARCH(1,1) with t-distributed shocks. All fitted residuals are made uniform
using their empirical CDFs and then resampled using a Gaussian copula to achieve a particular correlation before
transforming back. A pocket is classified as a period where a fitted squared forecast error differential (estimated
using a 1-sided Kernel with a 1-year effective sample size) is above 0 in the preceding period. We report 6 statistics.
The first 3 are Clark-West t-statistics relative to a prevailing mean benchmark in the full sample, in pockets,
and out of pockets. The second 3 are economic statistics associated with returns on a portfolio which utilizes the
time-varying coefficient model forecast in-pocket and the prevailing mean forecast out-of-pocket to allocate between
the risk-free asset and the market (portfolio weights are limited to be between 0 and 2): the annualized estimated
alpha in percentage points, the HAC t-statistic associated with that alpha, and the annualized Sharpe Ratio of the
portfolio. Column 2 presents the corresponding statistics from the data for reference.
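
The copula step described in this caption can be sketched as follows for the i.i.d. case. This is an illustration under stated assumptions: the function name, the use of `np.quantile` to invert the empirical CDFs, and the seed handling are hypothetical, and the block bootstrap and EGARCH variants are not covered.

```python
import math
import numpy as np

def gaussian_copula_resample(resid_x, resid_r, rho, n, seed=0):
    """Draw n residual pairs with empirical marginals and copula correlation rho."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    # Correlated standard normals carry the target dependence.
    z = rng.multivariate_normal(np.zeros(2), cov, size=n)
    # The standard normal CDF maps each margin to a uniform on [0, 1].
    u = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    # Empirical quantile functions transform back to the residual scale.
    new_x = np.quantile(np.asarray(resid_x, dtype=float), u[:, 0])
    new_r = np.quantile(np.asarray(resid_r, dtype=float), u[:, 1])
    return new_x, new_r
```

Here `rho` is the correlation of the latent normals. The Pearson correlation of the resampled residuals is close to it when the marginals are roughly Gaussian, but it need not match exactly for heavy-tailed residuals.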

Panel A: Clark-West statistics
Unrestricted + excess return forecasts All sign restrictions
Variables Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket
dp 0.01 3.79∗∗∗ −1.24 1.52∗ 4.50∗∗∗ −1.12 1.96∗∗ 4.84∗∗∗ −0.84
tbl 1.32∗ 4.52∗∗∗ −2.46††† 2.64∗∗∗ 4.04∗∗∗ −0.56 2.98∗∗∗ 4.62∗∗∗ −0.68
tsp 0.85 3.58∗∗∗ −1.67†† 1.64∗ 3.52∗∗∗ −0.99 0.30 3.37∗∗∗ −0.75
rvar −1.51† 2.88∗∗∗ −1.87†† −0.69 2.57∗∗∗ −1.35† −0.33 2.82∗∗∗ −1.19
mv −0.70 4.21∗∗∗ −1.36† 0.34 4.42∗∗∗ −1.18 0.34 4.42∗∗∗ −1.18
pc 1.77∗∗ 3.97∗∗∗ −1.17 2.50∗∗∗ 3.93∗∗∗ −0.15 2.50∗∗∗ 3.93∗∗∗ −0.15
comb1 5.58∗∗∗ 5.72∗∗∗ – 5.33∗∗∗ 5.40∗∗∗ – 5.80∗∗∗ 5.85∗∗∗ –
comb2 6.17∗∗∗ 6.33∗∗∗ – 6.05∗∗∗ 6.12∗∗∗ – 6.54∗∗∗ 6.59∗∗∗ –
comb3 −0.75 0.82 −1.89†† 1.13 1.64∗ −1.48† 1.25 1.48∗ −0.61
Panel B: Economic significance
Unrestricted + excess return forecasts All sign restrictions
Variables α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio
dp 2.02∗∗ 1.71 0.44 2.91∗∗ 2.10 0.47 3.09∗∗ 2.28 0.48
tbl 3.66∗∗∗ 3.04 0.61 4.60∗∗∗ 3.07 0.60 5.16∗∗∗ 3.36 0.63
tsp 2.49∗∗ 2.04 0.50 4.17∗∗∗ 2.66 0.55 3.14∗∗ 1.94 0.49
rvar 1.57∗ 1.49 0.44 2.20∗ 1.59 0.45 2.06∗∗ 1.73 0.45
mv 3.23∗∗∗ 2.83 0.58 5.32∗∗∗ 3.88 0.71 5.32∗∗∗ 3.88 0.71
pc 3.27∗∗∗ 2.62 0.56 4.53∗∗∗ 2.90 0.57 4.53∗∗∗ 2.90 0.57
comb1 3.40∗∗∗ 2.98 0.58 5.29∗∗∗ 3.45 0.63 5.34∗∗∗ 3.50 0.65
comb2 5.36∗∗∗ 4.44 0.71 6.98∗∗∗ 5.00 0.79 7.02∗∗∗ 5.14 0.78
comb3 2.34∗ 1.32 0.46 2.34∗∗ 1.87 0.46 2.72∗∗ 2.07 0.48
lpm 0.04 0.03 0.38 0.32 0.18 0.38 0.32 0.18 0.38

Table A.5: Out-of-sample measures of forecasting performance (daily, pockets identified relative to a local prevailing mean (lpm)
benchmark). Panel A reports the Clark and West (2007) test statistics for out-of-sample return predictability measured relative to a local prevailing mean
forecast. Panel B reports 3 measures of economic significance associated with returns on a portfolio which utilizes the time-varying coefficient model forecast
in-pocket and the local prevailing mean forecast out-of-pocket to allocate between the risk-free asset and the market (portfolio weights are limited to be
between 0 and 2): the annualized estimated alpha in percentage points, the HAC t-statistic for the estimated alpha, and the annualized Sharpe Ratio of
the portfolio. We use a purely backward-looking kernel with an effective sample size of 2.5 years to compute forecasts. “pc” is a recursively computed first
principal component of the four predictor variables. “mv” is a four-variable multivariate forecast estimated using a product Kernel. “comb1,” “comb2,”
and “comb3” refer to using a simple average of the univariate forecasts. “comb1” sets an individual predictor’s forecast to the time-varying coefficient
model forecast during a pocket and to the prevailing mean otherwise. “comb2” is the same as “comb1” except that it ignores individual predictor forecasts
when that variable is not in a pocket but at least one other variable is in a pocket. “comb3” makes no distinction between in-pocket and out-of-pocket
periods and always uses the simple equal-weighted average of all four univariate models. The CW test statistics approximately follow a normal distribution
with positive values indicating more accurate out-of-sample return forecasts than the local prevailing mean benchmark and negative values indicating the
opposite. A pocket is classified as a period where a fitted squared forecast error differential (estimated using a 1-sided Kernel with a 1-year effective sample
size) is above 0 in the preceding period. Consider a particular statistic of interest, β. ∗’s represent statistical significance at either the 10, 5, or 1% levels
from a hypothesis test of β > 0. †’s represent statistical significance at either the 10, 5, or 1% levels from a hypothesis test of β < 0.
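
The three combination schemes defined in this caption can be sketched as follows. The array layout (T periods by K predictors), the function name, and the choice to fall back on the prevailing mean in "comb2" when no predictor is in a pocket are assumptions made for illustration.

```python
import numpy as np

def combine_forecasts(model_fc, in_pocket, mean_fc):
    """Return (comb1, comb2, comb3) from (T, K) forecasts and pocket flags."""
    model_fc = np.asarray(model_fc, dtype=float)    # univariate model forecasts
    in_pocket = np.asarray(in_pocket, dtype=bool)   # pocket flag per predictor
    mean_fc = np.asarray(mean_fc, dtype=float)      # prevailing mean, shape (T,)

    # comb1: model forecast in-pocket, prevailing mean otherwise, then average.
    switched = np.where(in_pocket, model_fc, mean_fc[:, None])
    comb1 = switched.mean(axis=1)

    # comb2: average only over predictors currently in a pocket.
    n_active = in_pocket.sum(axis=1)
    active_sum = np.where(in_pocket, model_fc, 0.0).sum(axis=1)
    comb2 = np.where(n_active > 0, active_sum / np.maximum(n_active, 1), mean_fc)

    # comb3: equal-weighted average of all models, ignoring pocket status.
    comb3 = model_fc.mean(axis=1)
    return comb1, comb2, comb3
```

The difference between "comb1" and "comb2" is that the former shrinks the combined forecast toward the prevailing mean whenever only some predictors are in a pocket, while the latter does not.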
Panel A: Clark-West statistics
Unrestricted + excess return forecasts All sign restrictions
Variables Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket
dp 2.21∗∗ 4.98∗∗∗ −0.60 2.62∗∗∗ 4.53∗∗∗ −1.32† 1.13 4.14∗∗∗ −3.01†††
tbl 1.57∗ 2.81∗∗∗ −0.40 1.79∗∗ 2.85∗∗∗ −1.21 2.40∗∗∗ 4.47∗∗∗ −1.26
tsp 1.02 2.01∗∗ −0.80 1.00 2.73∗∗∗ 0.09 0.46 4.93∗∗∗ −0.20
rvar 0.70 3.01∗∗∗ −0.33 0.90 2.64∗∗∗ −2.24†† 1.04 3.58∗∗∗ −2.10††
mv 2.71∗∗∗ 3.43∗∗∗ 1.46∗ 2.23∗∗ 4.29∗∗∗ −0.68 1.65∗∗ 3.70∗∗∗ −0.69
pc 1.60∗ 2.23∗∗ 0.08 1.64∗ 2.42∗∗∗ 0.53 1.04 4.64∗∗∗ −1.25
comb1 3.81∗∗∗ 4.03∗∗∗ – 4.33∗∗∗ 4.39∗∗∗ – 4.82∗∗∗ 5.13∗∗∗ –
comb2 4.64∗∗∗ 4.97∗∗∗ – 5.45∗∗∗ 5.57∗∗∗ – 5.48∗∗∗ 5.90∗∗∗ –
comb3 1.99∗∗ 2.21∗∗ 0.50 2.46∗∗∗ 2.71∗∗∗ −1.79†† 1.79∗∗ 2.12∗∗ −0.93
Panel B: Economic significance
Unrestricted + excess return forecasts All sign restrictions
Variables α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio
dp 3.84∗∗∗ 3.12 0.59 3.74∗∗∗ 2.58 0.53 3.99∗∗∗ 2.94 0.56
tbl 2.10∗ 1.49 0.46 3.96∗∗ 2.28 0.56 5.87∗∗∗ 3.81 0.71
tsp 1.59 1.18 0.43 2.08 1.10 0.42 1.69 0.90 0.41
rvar 2.12∗∗ 1.67 0.47 2.21∗ 1.44 0.45 3.20∗∗ 2.02 0.51
mv 2.81∗∗ 2.17 0.51 4.41∗∗∗ 3.16 0.60 4.41∗∗∗ 3.16 0.60
pc 1.88∗ 1.30 0.45 2.38∗ 1.30 0.44 2.38∗ 1.30 0.44
comb1 3.73∗∗∗ 2.73 0.60 4.66∗∗∗ 2.66 0.59 5.66∗∗∗ 3.36 0.67
comb2 6.22∗∗∗ 4.69 0.80 8.15∗∗∗ 5.34 0.90 8.94∗∗∗ 5.73 0.93
comb3 1.41∗ 1.53 0.47 2.23∗ 1.35 0.45 4.48∗∗∗ 3.22 0.62
lpm −0.21 −0.16 0.38 −0.16 −0.09 0.38 −0.16 −0.09 0.38

Table A.6: Out-of-sample measures of forecasting performance (monthly, pockets identified relative to a local prevailing mean (lpm)
benchmark). Panel A reports the Clark and West (2007) test statistics for out-of-sample return predictability measured relative to a local prevailing mean
forecast. Panel B reports 3 measures of economic significance associated with returns on a portfolio which utilizes the time-varying coefficient model forecast
in-pocket and the local prevailing mean forecast out-of-pocket to allocate between the risk-free asset and the market (portfolio weights are limited to be
between 0 and 2): the annualized estimated alpha in percentage points, the HAC t-statistic for the estimated alpha, and the annualized Sharpe Ratio of
the portfolio. We use a purely backward-looking kernel with an effective sample size of 2.5 years to compute forecasts. “pc” is a recursively computed first
principal component of the four predictor variables. “mv” is a four-variable multivariate forecast estimated using a product Kernel. “comb1,” “comb2,”
and “comb3” refer to using a simple average of the univariate forecasts. “comb1” sets an individual predictor’s forecast to the time-varying coefficient
model forecast during a pocket and to the prevailing mean otherwise. “comb2” is the same as “comb1” except that it ignores individual predictor forecasts
when that variable is not in a pocket but at least one other variable is in a pocket. “comb3” makes no distinction between in-pocket and out-of-pocket
periods and always uses the simple equal-weighted average of all four univariate models. The CW test statistics approximately follow a normal distribution
with positive values indicating more accurate out-of-sample return forecasts than the local prevailing mean benchmark and negative values indicating the
opposite. A pocket is classified as a period where a fitted squared forecast error differential (estimated using a 1-sided Kernel with a 1-year effective sample
size) is above 0 in the preceding period. Consider a particular statistic of interest, β. ∗’s represent statistical significance at either the 10, 5, or 1% levels
from a hypothesis test of β > 0. †’s represent statistical significance at either the 10, 5, or 1% levels from a hypothesis test of β < 0.
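
The alpha and HAC t-statistic reported in Panel B can be sketched as follows. This sketch assumes that alpha is the intercept from regressing the portfolio's excess return on the market excess return, that the HAC covariance uses Newey-West (Bartlett) weights, and that returns are supplied in decimal form. The regression specification, the lag length `lags`, `periods_per_year`, and the function name are all assumptions.

```python
import numpy as np

def hac_alpha(port_excess, mkt_excess, lags=5, periods_per_year=252):
    """Annualized alpha (in percent) and its Newey-West t-statistic."""
    y = np.asarray(port_excess, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(mkt_excess, dtype=float)])
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y                 # OLS estimates [alpha, beta]
    g = X * (y - X @ beta)[:, None]          # moment contributions x_t * u_t
    S = g.T @ g                              # lag-0 term
    for lag in range(1, lags + 1):
        w = 1.0 - lag / (lags + 1.0)         # Bartlett kernel weight
        gamma = g[lag:].T @ g[:-lag]         # lag autocovariance of the moments
        S += w * (gamma + gamma.T)
    cov = xtx_inv @ S @ xtx_inv              # sandwich covariance of the estimates
    alpha, se = beta[0], np.sqrt(cov[0, 0])
    return alpha * periods_per_year * 100.0, alpha / se
```

For monthly data `periods_per_year` would be 12. The t-statistic is unaffected by the annualization because alpha and its standard error scale by the same factor.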
Panel A: Clark-West statistics
Unrestricted + excess return forecasts All sign restrictions
Variables Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket
dp 0.01 3.11∗∗∗ −0.91 1.52∗ 4.22∗∗∗ −0.37 1.96∗∗ 4.30∗∗∗ 0.02
tbl 1.32∗ 3.44∗∗∗ −1.04 2.64∗∗∗ 3.84∗∗∗ 0.11 2.98∗∗∗ 4.10∗∗∗ 0.45
tsp 0.85 3.38∗∗∗ −1.31† 1.64∗ 3.43∗∗∗ −0.34 0.30 3.34∗∗∗ −0.55
rvar −1.51† 3.21∗∗∗ −1.72†† −0.69 2.66∗∗∗ −0.85 −0.33 2.29∗∗ −0.71
mv −0.69 3.35∗∗∗ −1.18 0.34 4.10∗∗∗ −1.05 0.34 4.10∗∗∗ −1.05
pc 1.77 2.73∗∗∗ 0.25 2.50∗∗∗ 3.43∗∗∗ 0.84 2.50∗∗∗ 3.43∗∗∗ 0.84
comb1 4.84∗∗∗ 4.95∗∗∗ – 5.75∗∗∗ 5.89∗∗∗ – 5.90∗∗∗ 6.01∗∗∗ –
comb2 5.52∗∗∗ 5.66∗∗∗ – 6.12∗∗∗ 6.29∗∗∗ – 6.47∗∗∗ 6.65∗∗∗ –
comb3 −0.75 3.38∗∗∗ −1.99†† 1.13 1.11 0.20 1.25 2.62∗∗∗ −0.59
Panel B: Economic significance
Unrestricted + excess return forecasts All sign restrictions
Variables α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio
dp 2.06∗∗ 1.74 0.45 2.31∗ 1.56 0.44 2.15∗ 1.48 0.44
tbl 2.57∗∗ 2.29 0.52 3.98∗∗∗ 2.55 0.55 4.23∗∗∗ 2.68 0.57
tsp 2.52∗∗ 2.20 0.51 3.59∗∗ 2.26 0.50 2.82∗∗ 1.70 0.47
rvar 0.98 0.87 0.40 1.15 0.63 0.39 0.92 0.67 0.39
mv 2.51∗∗ 2.22 0.51 4.79∗∗∗ 3.53 0.67 4.79∗∗∗ 3.53 0.67
pc 2.21∗∗ 1.86 0.47 2.83∗∗ 1.69 0.46 2.83∗∗∗ 1.69 0.46
comb1 2.93∗∗∗ 2.55 0.54 4.68∗∗∗ 2.82 0.56 4.49∗∗∗ 2.69 0.56
comb2 5.23∗∗∗ 4.40 0.72 7.79∗∗∗ 5.39 0.84 7.37∗∗∗ 4.98 0.81
comb3 0.76∗ 1.32 0.43 2.34∗∗ 1.87 0.46 2.72∗∗ 2.07 0.48
lpm 0.04 0.03 0.38 0.32 0.18 0.38 0.32 0.18 0.38

Table A.7: Out-of-sample measures of forecasting performance (daily, pockets identified relative to a global prevailing mean benchmark).
Panel A reports the Clark and West (2007) test statistics for out-of-sample return predictability measured relative to a local prevailing mean
forecast. Panel B reports 3 measures of economic significance associated with returns on a portfolio which utilizes the time-varying coefficient model forecast
in-pocket and the local prevailing mean forecast out-of-pocket to allocate between the risk-free asset and the market (portfolio weights are limited to be
between 0 and 2): the annualized estimated alpha in percentage points, the HAC t-statistic for the estimated alpha, and the annualized Sharpe Ratio of
the portfolio. We use a purely backward-looking kernel with an effective sample size of 2.5 years to compute forecasts. “pc” is a recursively computed first
principal component of the four predictor variables. “mv” is a four-variable multivariate forecast estimated using a product Kernel. “comb1,” “comb2,”
and “comb3” refer to using a simple average of the univariate forecasts. “comb1” sets an individual predictor’s forecast to the time-varying coefficient
model forecast during a pocket and to the prevailing mean otherwise. “comb2” is the same as “comb1” except that it ignores individual predictor forecasts
when that variable is not in a pocket but at least one other variable is in a pocket. “comb3” makes no distinction between in-pocket and out-of-pocket
periods and always uses the simple equal-weighted average of all four univariate models. The CW test statistics approximately follow a normal distribution
with positive values indicating more accurate out-of-sample return forecasts than the local prevailing mean benchmark and negative values indicating the
opposite. A pocket is classified as a period where a fitted squared forecast error differential (estimated using a 1-sided Kernel with a 1-year effective sample
size) is above 0 in the preceding period. Consider a particular statistic of interest, β. ∗’s represent statistical significance at either the 10, 5, or 1% levels
from a hypothesis test of β > 0. †’s represent statistical significance at either the 10, 5, or 1% levels from a hypothesis test of β < 0.
Panel A: Clark-West statistics
Unrestricted + excess return forecasts All sign restrictions
Variables Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket
dp 2.21∗∗ 4.10∗∗∗ 0.53 2.62∗∗∗ 4.22∗∗∗ 0.32 2.98∗∗∗ 4.22∗∗∗ 0.78
tbl 1.57∗ 2.54∗∗∗ 0.08 1.79∗∗ 2.44∗∗∗ 0.26 3.02∗∗∗ 3.37∗∗∗ 1.07
tsp 1.02 2.37∗∗∗ −0.14 1.00 2.59∗∗∗ 0.33 0.14 1.46∗ −0.05
rvar 0.70 2.98∗∗∗ −0.10 0.90 2.66∗∗∗ −2.30†† 1.39∗ 3.22∗∗∗ −1.87††
mv 2.71∗∗∗ 2.98∗∗∗ 2.20∗∗ 2.23∗∗ 3.95∗∗∗ −0.16 2.23∗∗ 3.95∗∗∗ −0.16
pc 1.60∗ 2.94∗∗∗ 0.07 1.64∗ 2.40∗∗∗ 0.58 1.64∗ 2.40∗∗∗ 0.58
comb1 3.87∗∗∗ 4.31∗∗∗ – 3.61∗∗∗ 3.80∗∗∗ – 3.78∗∗∗ 3.98∗∗∗ –
comb2 4.20∗∗∗ 4.74∗∗∗ – 4.33∗∗∗ 4.59∗∗∗ – 4.92∗∗∗ 5.28∗∗∗ –
comb3 1.99∗∗ 2.40∗∗∗ −0.84 2.46∗∗∗ 2.33∗∗∗ 0.71 2.44∗∗∗ 2.04∗∗ 1.33∗
Panel B: Economic significance
Unrestricted + excess return forecasts All sign restrictions
Variables α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio
dp 2.07∗ 1.51 0.45 2.80∗ 1.60 0.46 2.80∗ 1.60 0.46
tbl 1.82 1.28 0.45 2.47∗ 1.39 0.45 4.09∗∗ 2.29 0.55
tsp 0.73 0.61 0.39 1.65 0.84 0.41 0.62 0.34 0.38
rvar 1.45 1.13 0.42 2.25∗ 1.47 0.45 3.11∗∗ 1.99 0.50
mv 0.79 0.63 0.39 3.61∗∗∗ 2.55 0.54 3.61∗∗∗ 2.55 0.54
pc 1.70 1.26 0.44 1.93 1.00 0.42 1.93 1.00 0.42
comb1 2.55∗∗ 1.95 0.50 3.53∗∗ 1.93 0.50 4.10∗∗ 2.26 0.54
comb2 5.06∗∗∗ 3.44 0.72 6.17∗∗∗ 3.87 0.74 7.17∗∗∗ 4.09 0.79
comb3 1.41∗ 1.53 0.47 2.23∗ 1.35 0.45 4.48∗∗∗ 3.22 0.62
lpm −0.21 −0.16 0.38 −0.16 −0.09 0.38 −0.16 −0.09 0.38

Table A.8: Out-of-sample measures of forecasting performance (monthly, pockets identified relative to a global prevailing mean
benchmark). Panel A reports the Clark and West (2007) test statistics for out-of-sample return predictability measured relative to a local prevailing mean
forecast. Panel B reports 3 measures of economic significance associated with returns on a portfolio which utilizes the time-varying coefficient model forecast
in-pocket and the local prevailing mean forecast out-of-pocket to allocate between the risk-free asset and the market (portfolio weights are limited to be
between 0 and 2): the annualized estimated alpha in percentage points, the HAC t-statistic for the estimated alpha, and the annualized Sharpe Ratio of
the portfolio. We use a purely backward-looking kernel with an effective sample size of 2.5 years to compute forecasts. “pc” is a recursively computed first
principal component of the four predictor variables. “mv” is a four-variable multivariate forecast estimated using a product Kernel. “comb1,” “comb2,”
and “comb3” refer to using a simple average of the univariate forecasts. “comb1” sets an individual predictor’s forecast to the time-varying coefficient
model forecast during a pocket and to the prevailing mean otherwise. “comb2” is the same as “comb1” except that it ignores individual predictor forecasts
when that variable is not in a pocket but at least one other variable is in a pocket. “comb3” makes no distinction between in-pocket and out-of-pocket
periods and always uses the simple equal-weighted average of all four univariate models. The CW test statistics approximately follow a normal distribution
with positive values indicating more accurate out-of-sample return forecasts than the local prevailing mean benchmark and negative values indicating the
opposite. A pocket is classified as a period where a fitted squared forecast error differential (estimated using a 1-sided Kernel with a 1-year effective sample
size) is above 0 in the preceding period. Consider a particular statistic of interest, β. ∗’s represent statistical significance at either the 10, 5, or 1% levels
from a hypothesis test of β > 0. †’s represent statistical significance at either the 10, 5, or 1% levels from a hypothesis test of β < 0.
Panel A: Clark-West statistics
Unrestricted + excess return forecasts All sign restrictions
Variables Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket Full sample In-pocket Out-of-pocket
dp −0.74 2.11∗∗ −1.26 0.40 1.99∗∗ −0.37 0.68 2.20∗∗ −0.18
tbl 0.68 1.29∗ 0.11 1.98∗∗ 3.39∗∗∗ 0.07 2.03∗∗ 3.57∗∗∗ −0.02
tsp 0.15 1.83∗∗ −0.52 0.95 3.00∗∗∗ −0.31 0.79 2.73∗∗∗ 0.30
rvar −1.49† 2.61∗∗∗ −1.64† −0.79 3.04∗∗∗ −0.91 −0.44 2.39∗∗∗ −0.59
mv −0.99 2.93∗∗∗ −1.36† −0.01 3.45∗∗∗ −1.02 −0.01 3.45∗∗∗ −1.02
pc 0.99 1.01 0.62 1.85∗∗ 3.54∗∗∗ 0.54 1.85∗∗ 3.54∗∗∗ 0.54
comb1 3.05∗∗∗ 3.02∗∗∗ – 4.58∗∗∗ 4.63∗∗∗ – 4.53∗∗∗ 4.60∗∗∗ –
comb2 3.31∗∗∗ 3.30∗∗∗ – 4.67∗∗∗ 4.79∗∗∗ – 4.62∗∗∗ 4.74∗∗∗ –
comb3 −1.03 −0.18 −1.00 0.23 −0.12 0.63 0.66 1.29 0.05
Panel B: Economic significance
Unrestricted + excess return forecasts All sign restrictions
Variables α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio
dp 0.67 0.91 0.40 0.68 1.06 0.40 0.97∗ 1.44 0.42
tbl 1.83∗∗∗ 2.63 0.55 2.70∗∗∗ 3.66 0.69 2.83∗∗∗ 3.62 0.70
tsp 1.86∗∗∗ 2.95 0.58 2.28∗∗∗ 3.13 0.61 1.68∗∗∗ 2.90 0.56
rvar 0.91∗∗ 1.95 0.48 0.87∗∗ 1.80 0.47 0.94∗∗ 1.69 0.45
mv 2.36∗∗∗ 3.17 0.63 3.81∗∗∗ 4.32 0.78 3.81∗∗∗ 4.32 0.78
pc 1.65∗∗ 2.32 0.52 2.18∗∗∗ 3.36 0.65 2.18∗∗∗ 3.36 0.65
comb1 3.86∗∗∗ 3.91 0.67 3.01∗∗∗ 4.72 0.76 2.96∗∗∗ 4.56 0.73
comb2 3.86∗∗∗ 3.91 0.65 4.80∗∗∗ 4.70 0.74 4.54∗∗∗ 4.62 0.72
comb3 0.76∗ 1.32 0.43 2.35∗∗ 1.88 0.46 2.73∗∗ 2.08 0.48

Table A.9: Out-of-sample measures of forecasting performance (daily, pockets identified relative to a global prevailing mean that are
not identified by the local prevailing mean). Panel A reports the Clark and West (2007) test statistics for out-of-sample return predictability measured
relative to a global prevailing mean forecast. Panel B reports 3 measures of economic significance associated with returns on a portfolio which utilizes the
time-varying coefficient model forecast in-pocket and the global prevailing mean forecast out-of-pocket to allocate between the risk-free asset and the market
(portfolio weights are limited to be between 0 and 2): the annualized estimated alpha in percentage points, the HAC t-statistic for the estimated alpha, and
the annualized Sharpe Ratio of the portfolio. We use a purely backward-looking kernel with an effective sample size of 2.5 years to compute forecasts. “pc”
is a recursively computed first principal component of the four predictor variables. “mv” is a four-variable multivariate forecast estimated using a product
Kernel. “comb1,” “comb2,” and “comb3” refer to using a simple average of the univariate forecasts. “comb1” sets an individual predictor’s forecast to
the time-varying coefficient model forecast during a pocket and to the prevailing mean otherwise. “comb2” is the same as “comb1” except that it ignores
individual predictor forecasts when that variable is not in a pocket but at least one other variable is in a pocket. “comb3” makes no distinction between
in-pocket and out-of-pocket periods and always uses the simple equal-weighted average of all four univariate models. The CW test statistics approximately
follow a normal distribution with positive values indicating more accurate out-of-sample return forecasts than the global prevailing mean benchmark and
negative values indicating the opposite. A pocket is classified as a period where a fitted squared forecast error differential (estimated using a 1-sided Kernel
with a 1-year effective sample size) is above 0 in the preceding period. Consider a particular statistic of interest, β. ∗’s represent statistical significance at
either the 10, 5, or 1% levels from a hypothesis test of β > 0. †’s represent statistical significance at either the 10, 5, or 1% levels from a hypothesis test of
β < 0.
Unrestricted + excess return forecasts All sign restrictions
Variables α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio α̂ tα̂ Sharpe Ratio
dp 2.24∗∗ 1.94 0.47 3.70∗∗ 2.15 0.51 4.01∗∗ 2.32 0.53
tbl 4.69∗∗∗ 2.74 0.62 9.10∗∗∗ 4.07 0.72 9.81∗∗∗ 4.30 0.75
tsp 2.72∗∗ 1.91 0.50 8.40∗∗∗ 3.88 0.70 7.34∗∗∗ 3.66 0.73
rvar 3.98∗∗∗ 2.89 0.63 4.67∗∗∗ 3.12 0.70 6.54∗∗∗ 3.48 0.68
mv 2.99∗∗ 1.86 0.50 7.90∗∗∗ 3.59 0.67 7.90∗∗∗ 3.59 0.67
pc 2.05∗∗ 2.08 0.51 4.46∗∗∗ 2.59 0.57 4.46∗∗∗ 2.59 0.57
comb1 5.95∗∗∗ 3.48 0.69 1.66∗ 1.58 0.47 1.73∗∗ 1.66 0.49
comb2 8.46∗∗∗ 4.12 0.77 10.25∗∗∗ 4.46 0.76 8.26∗∗∗ 3.84 0.71
comb3 3.71∗ 1.46 0.44 2.65∗ 1.50 0.43 2.96∗∗ 1.77 0.49
lm 4.85∗∗∗ 3.01 0.64 7.86∗∗∗ 3.44 0.65 7.86∗∗∗ 3.44 0.65

Table A.10: Economic measures of forecasting performance controlling for time-varying variance (daily benchmark specification). This
table reports 3 measures of economic significance associated with returns on a portfolio which utilizes the time-varying coefficient model forecast in-pocket
and the prevailing mean forecast out-of-pocket to allocate between the risk-free asset and the market controlling for local volatility using realized variance
(portfolio weights are limited to be between 0 and 2): the annualized estimated alpha in percentage points, the HAC t-statistic for the estimated alpha, and
the annualized Sharpe Ratio of the portfolio. We use a purely backward-looking kernel with an effective sample size of 2.5 years to compute forecasts. “pc”
is a recursively computed first principal component of the four predictor variables. “mv” is a four-variable multivariate forecast estimated using a product
Kernel. “comb1,” “comb2,” and “comb3” refer to using a simple average of the univariate forecasts. “comb1” sets an individual predictor’s forecast to
the time-varying coefficient model forecast during a pocket and to the prevailing mean otherwise. “comb2” is the same as “comb1” except that it ignores
individual predictor forecasts when that variable is not in a pocket but at least one other variable is in a pocket. “comb3” makes no distinction between
pocket and non-pocket periods and always uses the simple equal-weighted average of all four univariate models. A pocket is classified as a period where a
fitted squared forecast error differential (estimated using a 1-sided Kernel with a 1-year effective sample size) is above 0 in the preceding period. Consider
a particular statistic of interest, β. ∗’s represent statistical significance at either the 10, 5, or 1% levels from a hypothesis test of β > 0. †’s represent
statistical significance at either the 10, 5, or 1% levels from a hypothesis test of β < 0.
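
The allocation rule behind these economic measures can be sketched as follows. This sketch assumes a mean-variance weight, the forecast excess return divided by risk aversion times a variance forecast, clipped to the stated bounds of 0 and 2. The mean-variance form, the `risk_aversion` default, and the function name are assumptions.

```python
import numpy as np

def timing_weights(excess_return_fc, variance_fc, risk_aversion=3.0,
                   lower=0.0, upper=2.0):
    """Market weight from a return forecast and a variance forecast."""
    fc = np.asarray(excess_return_fc, dtype=float)
    var = np.asarray(variance_fc, dtype=float)
    raw = fc / (risk_aversion * var)         # unconstrained mean-variance weight
    return np.clip(raw, lower, upper)        # no short sales, leverage capped at 2
```

Controlling for time-varying variance corresponds to passing a local variance estimate, such as lagged realized variance, as `variance_fc` rather than a constant.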
Variables  Unrestricted                        + excess return forecasts           All sign restrictions
           None     1 bps    2 bps    10 bps   None     1 bps    2 bps    10 bps   None     1 bps    2 bps    10 bps
dp         1.69∗∗   1.66∗∗   1.63∗∗   1.40∗∗   2.51∗∗∗  2.47∗∗∗  2.44∗∗∗  2.18∗∗∗  2.95∗∗∗  2.90∗∗∗  2.85∗∗∗  2.46∗∗∗
           (2.10)   (2.06)   (2.03)   (1.75)   (2.89)   (2.85)   (2.82)   (2.53)   (3.26)   (3.21)   (3.16)   (2.74)
tbl        3.57∗∗∗  3.54∗∗∗  3.52∗∗∗  3.30∗∗∗  6.48∗∗∗  6.44∗∗∗  6.41∗∗∗  6.13∗∗∗  6.07∗∗∗  6.03∗∗∗  6.01∗∗∗  5.80∗∗∗
           (4.35)   (4.32)   (4.28)   (4.03)   (5.56)   (5.53)   (5.50)   (5.27)   (5.37)   (5.34)   (5.31)   (5.13)
tsp        3.15∗∗∗  3.12∗∗∗  3.10∗∗∗  2.94∗∗∗  5.70∗∗∗  5.66∗∗∗  5.63∗∗∗  5.41∗∗∗  5.04∗∗∗  5.01∗∗∗  4.98∗∗∗  4.79∗∗∗
           (4.26)   (4.23)   (4.21)   (4.00)   (4.95)   (4.92)   (4.90)   (4.71)   (4.45)   (4.43)   (4.41)   (4.25)
rvar       2.31∗∗∗  2.28∗∗∗  2.26∗∗∗  2.09∗∗∗  2.89∗∗∗  2.86∗∗∗  2.84∗∗∗  2.66∗∗∗  2.69∗∗∗  2.65∗∗∗  2.61∗∗∗  2.29∗∗∗
           (3.63)   (3.59)   (3.56)   (3.29)   (3.47)   (3.44)   (3.41)   (3.20)   (3.03)   (2.99)   (2.94)   (2.57)
mv         2.59∗∗∗  2.56∗∗∗  2.52∗∗∗  2.25∗∗∗  4.79∗∗∗  4.71∗∗∗  4.64∗∗∗  4.03∗∗∗  4.79∗∗∗  4.71∗∗∗  4.64∗∗∗  4.03∗∗∗
           (3.37)   (3.33)   (3.29)   (2.96)   (4.97)   (4.89)   (4.82)   (4.21)   (4.97)   (4.89)   (4.82)   (4.21)
pc         3.43∗∗∗  3.39∗∗∗  3.36∗∗∗  3.11∗∗∗  5.87∗∗∗  5.83∗∗∗  5.79∗∗∗  5.47∗∗∗  5.87∗∗∗  5.83∗∗∗  5.79∗∗∗  5.47∗∗∗
           (3.97)   (3.93)   (3.89)   (3.61)   (5.01)   (4.97)   (4.94)   (4.67)   (5.01)   (4.97)   (4.94)   (4.67)
comb1      6.38∗∗∗  6.34∗∗∗  6.29∗∗∗  5.95∗∗∗  6.72∗∗∗  6.67∗∗∗  6.63∗∗∗  6.30∗∗∗  6.69∗∗∗  6.64∗∗∗  6.59∗∗∗  6.20∗∗∗
           (6.11)   (6.07)   (6.03)   (5.71)   (6.71)   (6.66)   (6.62)   (6.30)   (6.46)   (6.41)   (6.37)   (6.02)
comb2      6.10∗∗∗  6.05∗∗∗  6.00∗∗∗  5.64∗∗∗  8.53∗∗∗  8.47∗∗∗  8.41∗∗∗  7.94∗∗∗  8.36∗∗∗  8.29∗∗∗  8.22∗∗∗  7.66∗∗∗
           (5.66)   (5.62)   (5.58)   (5.25)   (6.69)   (6.64)   (6.60)   (6.23)   (6.51)   (6.46)   (6.40)   (5.99)
comb3      0.76∗    0.72     0.67     0.32     2.34∗∗   2.25∗∗   2.14∗∗   1.31     2.72∗∗   2.64∗∗   2.55∗∗   1.86∗
           (1.32)   (1.25)   (1.17)   (0.55)   (1.87)   (1.79)   (1.71)   (1.04)   (2.07)   (2.01)   (1.95)   (1.41)
pm         −0.25†   −0.26†   −0.26††  −0.29††  −0.25†   −0.26†   −0.26††  −0.29††  −0.25†   −0.26†   −0.26††  −0.29††
           (−1.58)  (−1.64)  (−1.67)  (−1.85)  (−1.58)  (−1.64)  (−1.67)  (−1.85)  (−1.58)  (−1.64)  (−1.67)  (−1.85)

Table A.11: Economic forecasting performance robustness under transaction costs (daily benchmark specification). This table reports
annualized estimated alphas in percentage points associated with returns on a portfolio which utilizes the time-varying coefficient model forecast in-pocket
and the prevailing mean forecast out-of-pocket to allocate between the risk-free asset and the market (portfolio weights are limited to be between 0 and 2).
We consider 4 amounts of proportional transaction costs: none, 1 bps, 2 bps, and 10 bps. Significance of the estimated alpha is assessed using a t-statistic
estimated using HAC standard errors which are reported underneath each alpha estimate. We use a purely backward-looking kernel with an effective sample
size of 2.5 years to compute forecasts. “pc” is a recursively computed first principal component of the four predictor variables. “mv” is a four-variable
multivariate forecast estimated using a product Kernel. “comb1,” “comb2,” and “comb3” refer to using a simple average of the univariate forecasts. “comb1”
sets an individual predictor’s forecast to the time-varying coefficient model forecast during a pocket and to the prevailing mean otherwise. “comb2” is the
same as “comb1” except that it ignores individual predictor forecasts when that variable is not in a pocket but at least one other variable is in a pocket.
“comb3” makes no distinction between pocket and non-pocket periods and always uses the simple equal-weighted average of all four univariate models. A
pocket is classified as a period where a fitted squared forecast error differential (estimated using a 1-sided Kernel with a 1-year effective sample size) is
above 0 in the preceding period. Consider a particular statistic of interest, β. ∗’s represent statistical significance at either the 10, 5, or 1% levels from a
hypothesis test of β > 0. †’s represent statistical significance at either the 10, 5, or 1% levels from a hypothesis test of β < 0.
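
The allocation rule described in this caption can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the mean-variance mapping from forecast to weight, the `risk_aversion` parameter, the `variance` input, and the turnover-based cost model are all assumptions introduced here. Only the switch between the time-varying forecast (in-pocket) and the prevailing mean (out-of-pocket), the [0, 2] weight bounds, and the basis-point cost levels come from the caption.

```python
import numpy as np

def timing_returns(forecast, in_pocket, prevailing_mean, excess_ret, variance,
                   risk_aversion=3.0, cost_bps=1.0):
    """Net excess returns of a pocket-based timing rule (illustrative sketch).

    forecast[t] and prevailing_mean[t] are forecasts of excess_ret[t] formed
    with information available before period t.
    """
    forecast = np.asarray(forecast, dtype=float)
    prevailing_mean = np.asarray(prevailing_mean, dtype=float)
    excess_ret = np.asarray(excess_ret, dtype=float)
    variance = np.asarray(variance, dtype=float)
    in_pocket = np.asarray(in_pocket, dtype=bool)

    # Time-varying coefficient forecast in pockets, prevailing mean otherwise.
    used = np.where(in_pocket, forecast, prevailing_mean)

    # ASSUMPTION: mean-variance weight; the [0, 2] bounds follow the caption.
    weights = np.clip(used / (risk_aversion * variance), 0.0, 2.0)

    # ASSUMPTION: proportional cost charged on the absolute change in the
    # market weight, starting from an all-cash position.
    turnover = np.abs(np.diff(weights, prepend=0.0))
    net = weights * excess_ret - cost_bps * 1e-4 * turnover
    return weights, net
```

Setting `cost_bps` to 0, 1, 2, or 10 reproduces the four cost levels considered in the table; the alphas themselves would come from a separate regression of `net` on market excess returns, which is not shown.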
Unrestricted + excess return forecasts All sign restrictions
i.i.d. Block EGARCH i.i.d. Block EGARCH i.i.d. Block EGARCH
Statistics Actual Avg. p-val Avg. p-val Avg. p-val Actual Avg. p-val Avg. p-val Avg. p-val Actual Avg. p-val Avg. p-val Avg. p-val
dp
CWFS 0.96 0.00 0.17 −0.17 0.15 −0.14 0.14 1.03 −0.03 0.15 0.02 0.14 0.12 0.20 1.13 0.06 0.14 0.00 0.15 0.19 0.19
CWIP 4.05 −0.23 0.00 −0.45 0.00 −0.24 0.00 4.14 −0.22 0.00 −0.17 0.00 −0.15 0.00 4.14 −0.14 0.00 −0.11 0.00 −0.09 0.00
CWOOP −0.09 0.02 0.53 −0.09 0.49 −0.08 0.51 −3.12 0.02 1.00 0.06 1.00 0.17 1.00 −3.01 0.09 1.00 0.04 1.00 0.21 1.00
α̂ 2.35 −0.29 0.00 −0.26 0.00 −0.15 0.00 4.09 −0.32 0.00 −0.27 0.00 −0.19 0.00 4.09 −0.26 0.00 −0.30 0.00 −0.18 0.00
tα̂ 2.51 −0.75 0.00 −0.57 0.00 −0.32 0.00 3.24 −0.50 0.00 −0.42 0.00 −0.29 0.00 3.24 −0.39 0.00 −0.42 0.00 −0.25 0.00
SR 0.54 0.42 0.08 0.42 0.09 0.46 0.22 0.69 0.41 0.00 0.41 0.00 0.47 0.03 0.69 0.41 0.00 0.42 0.00 0.47 0.03
tbl
CWFS 1.25 0.33 0.20 0.82 0.36 0.14 0.14 1.23 0.43 0.23 1.00 0.44 0.47 0.27 2.40 0.82 0.06 1.53 0.18 0.73 0.06
CWIP 3.55 −0.66 0.00 −0.13 0.00 −0.73 0.00 4.38 −0.03 0.00 0.26 0.00 −0.10 0.00 4.47 0.25 0.00 0.71 0.00 0.14 0.00
CWOOP −0.83 0.32 0.85 0.81 0.94 0.19 0.84 −1.99 0.41 0.99 0.90 0.99 0.44 0.98 −1.26 0.75 0.97 1.33 0.99 0.67 0.97
α̂ 3.75 −0.14 0.00 0.05 0.00 0.09 0.01 6.04 −0.06 0.00 0.28 0.00 0.37 0.00 6.17 0.13 0.00 0.62 0.00 0.55 0.00
tα̂ 3.34 −0.54 0.00 −0.17 0.00 −0.13 0.01 4.43 −0.33 0.00 0.08 0.00 0.12 0.00 4.40 −0.05 0.00 0.47 0.00 0.28 0.00
SR 0.76 0.52 0.13 0.53 0.14 0.60 0.22 0.86 0.51 0.05 0.52 0.06 0.62 0.13 0.86 0.52 0.06 0.53 0.07 0.61 0.12
tsp
CWFS 0.78 0.36 0.38 0.58 0.45 0.18 0.29 0.28 0.50 0.61 1.00 0.77 0.60 0.62 0.46 0.64 0.57 0.81 0.63 0.73 0.58
CWIP 2.44 −0.28 0.01 −0.56 0.01 −0.49 0.01 4.75 −0.08 0.00 0.16 0.00 −0.71 0.00 4.93 0.02 0.00 0.01 0.00 0.23 0.00
CWOOP −1.15 0.36 0.92 0.59 0.94 0.21 0.90 −1.45 0.48 0.97 0.95 0.99 0.56 0.96 −0.20 0.59 0.77 0.74 0.81 0.67 0.76
α̂ 2.20 −0.38 0.01 −0.22 0.01 −0.17 0.01 5.09 −0.24 0.00 −0.02 0.00 0.03 0.00 4.06 −0.12 0.00 −0.06 0.00 0.17 0.00
tα̂ 2.64 −0.77 0.00 −0.52 0.00 −0.42 0.00 3.52 −0.51 0.00 −0.22 0.00 −0.20 0.00 3.21 −0.32 0.00 −0.20 0.00 0.06 0.00
SR 0.65 0.44 0.08 0.44 0.10 0.37 0.05 0.76 0.43 0.01 0.43 0.02 0.38 0.01 0.70 0.43 0.04 0.43 0.04 0.37 0.03
rvar
CWFS 0.64 −0.09 0.26 0.27 0.45 −0.09 0.30 0.40 −0.07 0.33 0.05 0.45 0.13 0.44 1.04 −0.10 0.14 −0.06 0.14 0.12 0.19
CWIP 3.28 −0.44 0.00 −0.12 0.00 −0.23 0.00 3.18 −0.22 0.00 0.03 0.00 −0.06 0.00 3.58 −0.22 0.00 −0.07 0.00 −0.12 0.00
CWOOP 0.00 −0.05 0.48 0.24 0.63 0.07 0.49 −2.73 −0.03 1.00 −0.03 1.00 0.16 1.00 −2.10 −0.09 0.98 −0.08 0.99 0.12 0.99
α̂ 2.19 −0.27 0.00 −0.07 0.01 0.09 0.01 3.51 −0.24 0.00 0.03 0.00 0.17 0.01 3.71 −0.20 0.00 −0.02 0.00 0.15 0.01
tα̂ 2.71 −0.68 0.00 −0.18 0.00 0.00 0.01 3.25 −0.45 0.00 −0.04 0.00 0.04 0.00 3.67 −0.37 0.00 −0.11 0.00 −0.02 0.00
SR 0.55 0.46 0.25 0.45 0.24 0.49 0.35 0.65 0.46 0.11 0.45 0.10 0.49 0.18 0.66 0.45 0.05 0.44 0.08 0.49 0.16

Table A.12: OOS statistical model simulations (monthly). This table reports Monte Carlo simulation results for the 1-sided Kernel
empirical findings. We consider 3 ways of bootstrapping the fitted residuals from a constant coefficient predictive regression model for excess returns and an
AR(1) model for the predictor: (i) an i.i.d. heteroskedastic bootstrap, (ii) a stationary block bootstrap where the optimal block length is chosen according
to Politis and White (2004), (iii) an EGARCH(1,1) with t-distributed shocks. All residuals are resampled jointly to preserve the cross-sectional correlation
between the innovations to the predictor and excess returns. We generate 1,000 bootstrap samples of the same sample size as is available for each predictor in
the data. A pocket is classified as a period where a fitted squared forecast error differential (estimated using a 1-sided Kernel with a 1-year effective sample
size) is above 0 in the preceding period. We report 6 statistics. The first 3 are Clark and West (2007) t-statistics relative to a prevailing mean benchmark
in the full sample, in-pocket, and out-of-pocket. The second 3 are economic statistics associated with returns on a portfolio which utilizes the time-varying
coefficient model forecast in-pocket and the prevailing mean forecast out-of-pocket to allocate between the risk-free asset and the market (portfolio weights
are limited to be between 0 and 2): the annualized estimated alpha in percentage points, the HAC t-statistic associated with that estimated alpha, and the
annualized Sharpe Ratio of the portfolio. Column 2 presents the corresponding statistics from the data for reference.
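
The simplest of the three bootstrap designs can be sketched as follows. This is a hedged illustration under stated assumptions: it uses plain i.i.d. resampling of residual pairs (it does not attempt the heteroskedastic variant, the stationary block bootstrap, or the EGARCH design), and the function names `ols` and `iid_joint_bootstrap` are hypothetical. What it does take from the caption is the constant coefficient predictive regression, the AR(1) for the predictor, and the joint resampling that preserves the correlation between the two innovations.

```python
import numpy as np

def ols(y, x):
    """OLS of y on a constant and x; returns (coefficients, residuals)."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

def iid_joint_bootstrap(ret, pred, n_boot=1000, seed=0):
    """Simulate (return, predictor) paths under constant coefficients."""
    ret = np.asarray(ret, dtype=float)
    pred = np.asarray(pred, dtype=float)
    # Predictive regression r[t+1] = a + b x[t] + e[t+1].
    (a, b), e = ols(ret[1:], pred[:-1])
    # AR(1) for the predictor x[t+1] = c + rho x[t] + u[t+1].
    (c, rho), u = ols(pred[1:], pred[:-1])

    rng = np.random.default_rng(seed)
    T = len(ret)
    sims = []
    for _ in range(n_boot):
        # One index per period, shared by e and u, so that the cross-sectional
        # correlation between the two innovations is preserved.
        idx = rng.integers(0, len(e), size=T - 1)
        r_sim = np.empty(T)
        x_sim = np.empty(T)
        r_sim[0], x_sim[0] = ret[0], pred[0]
        for t in range(T - 1):
            r_sim[t + 1] = a + b * x_sim[t] + e[idx[t]]
            x_sim[t + 1] = c + rho * x_sim[t] + u[idx[t]]
        sims.append((r_sim, x_sim))
    return sims
```

Each simulated pair has the same length as the input data, so the pocket procedure can be rerun on every draw to build the sampling distributions summarized in the "Avg." and "p-val" columns.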
SMB HML
Statistics dp tbl tsp rvar dp tbl tsp rvar
Num pockets 25 14 12 20 20 16 11 24
Fraction of sample 0.35 0.32 0.24 0.28 0.32 0.33 0.34 0.25
Duration
Min 38 21 5 26 34 88 25 67
Mean 316.9 332.0 255.0 308.7 354.8 297.8 384.8 232.9
Max 1,588 1,217 739 830 893 598 1,353 660
Integral R2
Min -0.31 -0.08 -0.04 0.17 -0.99 0.15 0.03 0.04
Mean 5.74 5.63 3.78 6.31 4.14 3.46 5.31 3.13
Max 57.92 31.95 12.73 41.12 16.75 9.17 25.36 18.94

Table A.13: Pocket statistics (daily FF factor returns). This table reports statistics on the duration of
pockets (in days) and the integral R2 of pockets. Coefficients are estimated using a 1-sided Kernel with a 2.5 year
effective sample size and pockets are determined as periods where a fitted squared forecast error differential (relative
to a prevailing mean forecast and estimated using a 1-sided Kernel with a 1 year effective sample size) is above 0 in
the preceding period.
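
The pocket definition used throughout these tables can be sketched as below. This is a minimal sketch, assuming exponentially decaying one-sided weights and a kernel-weighted average as the "fitted" differential; the paper's actual kernel and local fitting procedure are not specified in this section. The sign convention (prevailing mean squared error minus model squared error, so positive values favor the time-varying model) is also an assumption, and all function names are hypothetical.

```python
import numpy as np

def one_sided_kernel_mean(x, bandwidth):
    """Backward-looking kernel-weighted average (exponential weights assumed)."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    for t in range(len(x)):
        lags = np.arange(t, -1, -1)        # distance of observations 0..t from t
        w = np.exp(-lags / bandwidth)      # only past and current data get weight
        out[t] = np.sum(w * x[: t + 1]) / np.sum(w)
    return out

def find_pockets(sed, bandwidth):
    """Flag period t as in-pocket when the fitted differential at t-1 is > 0."""
    fitted = one_sided_kernel_mean(sed, bandwidth)
    in_pocket = np.zeros(len(fitted), dtype=bool)
    in_pocket[1:] = fitted[:-1] > 0
    return in_pocket

def pocket_durations(in_pocket):
    """Lengths of consecutive runs of in-pocket periods."""
    durations, run = [], 0
    for flag in in_pocket:
        if flag:
            run += 1
        elif run:
            durations.append(run)
            run = 0
    if run:
        durations.append(run)
    return durations
```

The number of pockets, the fraction of the sample spent in pockets, and the minimum, mean, and maximum duration reported in the table follow directly from `pocket_durations`.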

Correlation with True Risk Premium
Variables   Bansal-Yaron   Campbell-Cochrane   Garleanu-Panageas   Wachter
dp          0.14           0.99                0.99                1.00
risk-free   -0.03          0.94                0.49                -1.00
rvar        0.75           0.84                0.92                0.18
True R^2 (in %)
Variables   Bansal-Yaron   Campbell-Cochrane   Garleanu-Panageas   Wachter
dp          1.28 × 10^-4   0.03                6.78 × 10^-3        0.04
risk-free   2.05 × 10^-5   0.03                1.66 × 10^-3        0.04
rvar        2.06 × 10^-3   0.02                5.81 × 10^-3        1.35 × 10^-3
rp          3.44 × 10^-3   0.03                6.92 × 10^-3        0.04
Stambaugh Correlation
Variables   Bansal-Yaron   Campbell-Cochrane   Garleanu-Panageas   Wachter
dp          -0.81          -1.00               -0.99               -0.82
risk-free   0.80           -0.97               -0.44               0.82
rvar        0.03           0.03                0.02                -0.33
rp          -0.05          -0.97               -0.95               -0.82
First Order Autocorrelation (annualized)
Variables   Bansal-Yaron   Campbell-Cochrane   Garleanu-Panageas   Wachter
dp          0.78           0.89                0.97                0.93
risk-free   0.78           0.90                0.94                0.93
rvar        0.17           0.53                0.53                0.02
rp          0.85           0.88                0.97                0.93

Table A.14: Asset pricing model statistics. This table reports various correlations and autocorrelations
associated with financial predictors across the 4 different asset pricing models we consider in the paper. The first panel
reports the estimated correlation between each model’s true risk premium rp and the model’s dividend-price ratio dp,
risk-free rate r, and 60-day realized variance rvar. The second panel reports the true R2 from predictive regressions
of excess returns on dp, r, rvar, and rp. The third panel reports the estimated correlation between innovations to the
excess return predictive regression and an AR(1) model estimated for dp, r, rvar, and rp. The fourth panel reports
the annualized AR(1) coefficients for dp, r, rvar, and rp.
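
The third panel's statistic can be computed along the following lines. This is a sketch under the definition given in the caption (the correlation between the residuals of the predictive regression and the residuals of an AR(1) for the predictor); the helper names are hypothetical and the timing convention, with the predictor lagged one period, is an assumption.

```python
import numpy as np

def _residuals(y, x):
    """Residuals from an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

def stambaugh_correlation(excess_ret, predictor):
    """Correlation between return and predictor innovations."""
    r = np.asarray(excess_ret, dtype=float)
    x = np.asarray(predictor, dtype=float)
    e = _residuals(r[1:], x[:-1])   # innovations to the predictive regression
    u = _residuals(x[1:], x[:-1])   # innovations to the AR(1) for the predictor
    return float(np.corrcoef(e, u)[0, 1])
```

A value close to -1, as reported for dp in several of the models, indicates that shocks to the predictor and shocks to returns are almost perfectly negatively related.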

Bansal-Yaron Campbell-Cochrane Garleanu-Panageas Wachter Wachter (no disasters)
Stats Sample Avg. Std. err. p-val Avg. Std. err. p-val Avg. Std. err. p-val Avg. Std. err. p-val Avg. Std. err. p-val
dp
CWFS 0.40 −0.03 0.99 0.76 0.38 1.00 0.88 0.16 0.97 0.83 0.74 1.04 0.92 0.97 1.09 0.94
CWIP 3.79 −0.03 1.00 0.00 0.11 0.98 0.00 0.06 0.98 0.00 0.43 1.04 0.01 0.59 1.08 0.02
CWOOP −1.94 −0.05 1.01 0.92 0.37 1.00 0.97 0.12 0.97 0.95 0.62 1.03 0.98 0.75 1.03 0.99
α̂ 2.51 −0.40 1.83 0.12 0.05 1.06 0.05 −0.02 0.98 0.04 0.28 2.15 0.23 0.34 1.44 0.17
tα̂ 2.89 −0.22 1.00 0.01 0.04 1.01 0.02 −0.02 0.99 0.02 0.17 1.08 0.03 0.22 1.01 0.03
SR 0.54 0.44 0.13 0.40 0.47 0.07 0.53 0.33 0.11 0.08 0.46 0.12 0.46 0.58 0.10 0.90
r
CWFS 1.98 −0.03 0.99 0.25 0.39 1.00 0.39 0.12 0.95 0.27 0.74 1.04 0.51 0.98 1.09 0.60
CWIP 4.75 −0.05 1.02 0.00 0.13 0.98 0.00 0.05 0.95 0.00 0.43 1.04 0.00 0.60 1.07 0.01
CWOOP −1.33 −0.04 1.01 0.92 0.37 0.99 0.97 0.09 0.97 0.95 0.62 1.03 0.98 0.75 1.03 0.99
α̂ 6.48 −0.39 1.83 0.01 0.06 1.06 0.00 −0.03 0.95 0.00 0.28 2.15 0.06 0.34 1.44 0.02
tα̂ 5.56 −0.22 1.00 0.00 0.04 1.01 0.00 −0.02 0.99 0.00 0.17 1.08 0.00 0.22 1.01 0.00
SR 0.94 0.44 0.13 0.01 0.47 0.07 0.00 0.33 0.11 0.00 0.46 0.12 0.00 0.58 0.10 0.03
rvar
CWFS −0.79 −0.02 1.02 0.93 0.02 0.99 0.93 0.04 0.97 0.94 0.33 1.00 0.97 0.53 1.10 0.97
CWIP 3.93 −0.04 1.04 0.00 −0.03 1.03 0.00 −0.01 0.98 0.00 0.18 0.99 0.01 0.29 1.07 0.01
CWOOP −1.07 −0.02 1.00 0.96 0.03 0.99 0.96 0.04 0.96 0.97 0.33 1.02 0.98 0.46 1.02 0.99
α̂ 2.89 −0.39 1.84 0.07 −0.06 1.04 0.01 −0.04 0.96 0.01 0.07 1.72 0.08 0.19 1.41 0.07
tα̂ 3.47 −0.22 1.00 0.00 0.04 1.01 0.00 −0.02 0.99 0.00 0.17 1.08 0.00 0.22 1.01 0.00
SR 0.71 0.44 0.13 0.03 0.47 0.07 0.01 0.33 0.11 0.00 0.45 0.12 0.03 0.58 0.10 0.12

Table A.15: OOS asset pricing model simulations (+ excess return forecasts). This table reports Monte Carlo simulation results of our 1-sided
Kernel estimation applied to data simulated from 4 different asset pricing models (this includes two specifications of Wachter’s rare disasters model, one of
which omits data from disaster episodes). We report 6 statistics. The first 3 are Clark and West (2007) t-statistics relative to a prevailing mean benchmark
in the full sample, in-pocket, and out-of-pocket. The second 3 are economic statistics associated with returns on a portfolio which utilizes the time-varying
coefficient model forecast in-pocket and the prevailing mean forecast out-of-pocket to allocate between the risk-free asset and the market (portfolio weights
are limited to be between 0 and 2): the annualized estimated alpha in percentage points, the HAC t-statistic associated with that alpha, and the annualized
Sharpe Ratio of the portfolio. Column 2 presents the corresponding statistics from the data for reference. The time-varying coefficient model forecast is
restricted to be greater than or equal to 0.
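
The Clark and West (2007) statistic reported in the first three rows of each block can be sketched as below. The adjusted loss differential is the standard MSPE-adjusted term from Clark and West; the use of a plain (non-HAC) standard error and the helper `clark_west_by_regime` for the full-sample, in-pocket, and out-of-pocket split are simplifying assumptions made for this illustration.

```python
import numpy as np

def clark_west_tstat(actual, bench_fc, model_fc):
    """Clark-West t-statistic of a larger model against a nested benchmark."""
    y = np.asarray(actual, dtype=float)
    f0 = np.asarray(bench_fc, dtype=float)   # e.g. prevailing mean forecast
    f1 = np.asarray(model_fc, dtype=float)   # e.g. time-varying coefficient forecast
    # Benchmark squared error minus the adjusted squared error of the model.
    adj = (y - f0) ** 2 - ((y - f1) ** 2 - (f0 - f1) ** 2)
    # ASSUMPTION: i.i.d. standard error; a HAC estimator could be used instead.
    return float(adj.mean() / (adj.std(ddof=1) / np.sqrt(len(adj))))

def clark_west_by_regime(actual, bench_fc, model_fc, in_pocket):
    """Full-sample, in-pocket, and out-of-pocket Clark-West statistics."""
    y, f0, f1 = (np.asarray(v, dtype=float) for v in (actual, bench_fc, model_fc))
    mask = np.asarray(in_pocket, dtype=bool)
    return {
        "full": clark_west_tstat(y, f0, f1),
        "in_pocket": clark_west_tstat(y[mask], f0[mask], f1[mask]),
        "out_of_pocket": clark_west_tstat(y[~mask], f0[~mask], f1[~mask]),
    }
```

Large positive values reject equal predictive accuracy in favor of the larger model in a one-sided test, which is how the in-pocket rows of the table should be read.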
Bansal-Yaron Campbell-Cochrane Garleanu-Panageas Wachter Wachter (no disasters)
Stats Sample Avg. Std. err. p-val Avg. Std. err. p-val Avg. Std. err. p-val Avg. Std. err. p-val Avg. Std. err. p-val
dp
CWFS 0.68 0.00 1.00 0.78 0.39 0.99 0.87 0.20 0.96 0.83 0.77 1.05 0.93 1.02 1.09 0.95
CWIP 4.03 0.03 1.01 0.00 0.13 1.00 0.00 0.11 1.00 0.00 0.46 1.04 0.01 0.64 1.08 0.02
CWOOP −1.84 −0.04 1.02 0.94 0.36 1.00 0.97 0.15 0.97 0.96 0.64 1.04 0.99 0.78 1.03 0.99
α̂ 2.95 −0.40 1.87 0.14 0.04 1.05 0.06 −0.03 0.98 0.04 0.24 2.16 0.23 0.34 1.43 0.16
tα̂ 3.26 −0.21 1.00 0.01 0.03 1.00 0.01 −0.02 0.98 0.01 0.15 1.07 0.03 0.22 1.01 0.03
SR 0.57 0.44 0.13 0.40 0.47 0.07 0.53 0.33 0.11 0.08 0.46 0.12 0.45 0.58 0.10 0.89
r
CWFS 2.03 −0.01 1.00 0.23 0.13 0.66 0.18 −0.20 0.98 0.18 0.80 1.03 0.54 1.01 1.08 0.62
CWIP 4.69 −0.00 1.01 0.00 −0.08 0.83 0.00 −0.12 1.00 0.00 0.46 1.03 0.00 0.62 1.07 0.01
CWOOP −1.21 −0.03 1.02 0.93 0.25 0.86 0.98 −0.17 0.96 0.93 0.68 1.02 0.99 0.78 1.02 0.99
α̂ 6.07 −0.41 1.86 0.01 −0.01 0.41 0.00 −0.22 0.88 0.00 0.34 2.08 0.06 0.35 1.43 0.02
tα̂ 5.37 −0.21 1.00 0.00 0.03 1.00 0.00 −0.02 0.98 0.00 0.15 1.07 0.00 0.22 1.01 0.00
SR 0.92 0.44 0.13 0.01 0.48 0.07 0.00 0.33 0.11 0.00 0.46 0.12 0.00 0.58 0.10 0.03
rvar
CWFS −0.44 0.06 0.99 0.94 0.62 0.90 0.98 0.08 1.00 0.95 0.47 0.96 0.99 0.80 0.97 0.99
CWIP 3.10 −0.01 0.96 0.00 0.17 0.97 0.00 −0.04 0.98 0.00 0.24 0.97 0.00 0.40 1.06 0.01
CWOOP −0.97 0.05 0.98 0.97 0.59 0.90 0.99 0.07 1.01 0.97 0.45 0.97 0.99 0.66 0.95 0.99
α̂ 2.69 −0.08 1.74 0.09 0.31 0.98 0.02 0.05 0.98 0.01 0.10 1.77 0.09 0.46 1.31 0.08
tα̂ 3.03 −0.21 1.00 0.00 0.03 1.00 0.00 −0.02 0.98 0.00 0.15 1.07 0.00 0.22 1.01 0.00
SR 0.54 0.44 0.13 0.03 0.47 0.07 0.01 0.33 0.11 0.00 0.45 0.12 0.03 0.58 0.10 0.13

Table A.16: OOS asset pricing model simulations (all sign restrictions). This table reports Monte Carlo simulation results of our 1-sided Kernel
estimation applied to data simulated from 4 different asset pricing models (this includes two specifications of Wachter’s rare disasters model, one of which
omits data from disaster episodes). We report 6 statistics. The first 3 are Clark and West (2007) t-statistics relative to a prevailing mean benchmark in
the full sample, in-pocket, and out-of-pocket. The second 3 are economic statistics associated with returns on a portfolio which utilizes the time-varying
coefficient model forecast in-pocket and the prevailing mean forecast out-of-pocket to allocate between the risk-free asset and the market (portfolio weights
are limited to be between 0 and 2): the annualized estimated alpha in percentage points, the HAC t-statistic associated with that alpha, and the annualized
Sharpe Ratio of the portfolio. Column 2 presents the corresponding statistics from the data for reference. The time-varying coefficient model forecast is
restricted to be greater than or equal to 0 and the estimated coefficients are restricted to be of a particular sign in accordance with economic theory: + for
dp, - for r, and + for rvar.
Panel A: Parameters
Parameter Notation Baseline Baseline (λ = 0) RE recalibrated
Sticky expectations parameter λ 0.98107 0.00000 0.00000
Persistence of cash flow state ρ_cf^252 0.66854 0.66854 0.66854
Persistence of discount rate state ρ_dr^252 0.62264 0.62264 0.66171
Persistence of time-preference state ρ_tp^252 0.94772 0.94772 0.96139
Loading of risk-free rate on cash flows β_{rf,cf} -0.01707 -0.01707 -0.19009
Loading of risk-free rate on discount rates β_{rf,dr} -0.27597 -0.27597 -0.21651
Volatility of cash flow state σ_{z_cf} 0.08584 0.08584 0.04000
Volatility of discount rate state σ_{z_dr} 0.06263 0.06263 0.06461
Volatility of time-preference state σ_{z_tp} 0.02668 0.02668 0.01903
Volatility of dividend growth σ_{∆d} 0.07999 0.07999 0.06042
Volatility of subj. expected dividend growth σ_{F[∆d]} 0.07638 0.08584 0.04000
Panel B: Moments
Moments Data Baseline Baseline (λ = 0) RE recalibrated
Volatility of log returns 0.15664 0.15706 0.23823 0.17180
Volatility of log dp ratio 0.37688 0.36889 0.38010 0.30874
Volatility of risk-free 0.03198 0.03182 0.03182 0.02482
AC1 of log pd ratio 0.89277 0.93531 0.83991 0.87304
AC1 of log rf ratio 0.89873 0.83734 0.83670 0.82521
Correlation(pd,rf) -0.60772 -0.59254 -0.57996 -0.56010
OLS coef. of excess returns on log pd -0.04317 -0.01638 -0.01543 -0.03061
OLS coef. of excess returns on log rf -0.00616 -0.00424 -0.00424 -0.00582
Stambaugh correlation of log dp -0.86622 -0.85846 -0.94193 -0.93608
Stambaugh correlation of log rf -0.04100 0.19509 0.07516 0.09248

Table A.17: Calibrated parameters and moments for sticky expectations model. This table
reports the calibrated parameters and analytic moments of the Sticky Expectations VAR Model. All
parameters and moments are reported in annualized units. Panel A reports calibrated parameters and Panel B
reports the implied moments of interest. The “Data” column in Panel B lists the annualized empirical targets
used for calibration. The three rightmost columns refer to three separate calibrations. In order, “Baseline”
refers to the standard calibration with sticky expectations, “Baseline (λ = 0)” refers to the “Baseline”
calibration but with rational expectations (i.e., λ = 0), and “RE recalibrated” refers to a recalibration of the
rational expectations model to match the target moments. Calibrated parameters are chosen by minimizing
the weighted sum of squared deviations of analytic moments from empirical targets.

[Figure A.1: two panels, each titled "tbl". Horizontal axis: 1970 to 2015. Vertical axis scaled by 10^-4 (top panel) and 10^-3 (bottom panel).]

Figure A.1: Cumsum of squared forecast error differentials for the tbl model. Each panel presents the cumulated sum of squared
forecast error differentials between a time-varying coefficient model with a 2.5-year effective sample size and the prevailing mean model. Areas shaded in gray
are pocket periods identified in real time as periods when a fitted squared forecast error differential estimated using a 1-year effective sample size is greater
than 0 in the preceding period. The top row shows a forecasting rule which uses the time-varying coefficient model in pockets and the prevailing
mean model out of pockets. The bottom row shows a forecasting rule which uses the prevailing mean model in pockets and the time-varying
coefficient model out of pockets.
[Figure A.2: line plot with one series per predictor (dp, tbl, tsp, rvar). Horizontal axis: 1 to 4.]

Figure A.2: Clark-West statistics by quartile of fitted squared forecast error differential. For
each univariate model, we sort its forecasts into four bins according to the quartiles of our $\widehat{SED}_t$ measure.
For each of these quartiles we report the estimated Clark-West (2007) statistic. The dashed line corresponds
to a rough 95% cutoff level of 2 for the t-statistics.

Figure A.3: Local return predictability (HML). Each panel plots 1-sided non-parametric kernel estimates of the local $\widehat{SED}_t$ (estimated
using a 1-year effective sample size) from a regression of daily returns on the Fama-French HML portfolio on each of the four predictor variables
using an effective sample size of 2.5 years. The shaded areas represent periods when $\widehat{SED}_t > 0$, with areas colored in red representing pockets
that have less than a 10% chance of being spurious and areas colored in blue representing pockets that have more than a 10% chance of being
spurious. The sampling distributions used to determine spuriousness come from an EGARCH(1,1) residual bootstrap design.

Figure A.4: Local return predictability (SMB). Each panel plots 1-sided non-parametric kernel estimates of the local $\widehat{SED}_t$ (estimated
using a 1-year effective sample size) from a regression of daily returns on the Fama-French SMB portfolio on each of the four predictor variables
using an effective sample size of 2.5 years. The shaded areas represent periods when $\widehat{SED}_t > 0$, with areas colored in red representing pockets
that have less than a 10% chance of being spurious and areas colored in blue representing pockets that have more than a 10% chance of being
spurious. The sampling distributions used to determine spuriousness come from an EGARCH(1,1) residual bootstrap design.

Figure A.5: Analytic impulse response functions to one-quarter discount-rate and cash-flow
shocks. This figure displays the impulse response functions of four variables: log risk-free
rate, log subjective risk premium, log dividend-price ratio, and log returns. The left column reports results
for a one-quarter discount-rate shock and the right column reports results for a one-quarter cash-flow shock.
The impulse response functions are calculated analytically according to baseline calibrated parameters. The
blue line refers to the case when expectations are sticky (λ ≠ 0) and the orange line refers to the case when
expectations are rational (λ = 0).

Figure A.6: Cross-correlations between pocket indicator and measures of return predictability
in sticky expectations model. The figure depicts the cross-correlations, computed from data simulated from the sticky
expectations model, between a pocket indicator, p_t, and two measures of the strength of predictability coming from
the sticky expectations channel: |ϑ_{t−h}| and the subjective risk premium |z_{dr,t−h}|.

[Figure A.7: grouped bar chart. Legend: dp, tbl, tsp, rvar, pc, mv, comb1, comb2, comb3. Vertical axis: Correlation (-1 to 1). Horizontal axis: Coibion-Gorodnichenko Variables (gy, ue, ip).]

Figure A.7: Correlation of Coibion-Gorodnichenko forecast errors with excess return forecasts
(one quarter lead). This figure reports correlations between forecast errors of three macroeconomic
variables from the Survey of Professional Forecasters (SPF) and excess return forecasts from our time-varying
coefficient models. The three sets of bars correspond to forecast errors for real GDP growth (gy), the
unemployment rate (ue), and real industrial production growth (ip). The heights of the nine colored bars
represent correlations of those forecast errors with the labeled excess return forecasts from our time-varying
predictor models. Each bar is bracketed by a 95% confidence interval computed using HAC standard errors.
Since the SPF respondents send in their forecasts in the middle of each quarter, we lead the SPF forecasts
by one quarter to be conservative about information sets.
