The Explanatory Power of Sectors and Style

Factor analysis is a popular and effective technique that explains and forecasts security returns. The factor models prevalent in academic circles (Fama-French, Carhart) tend to rely heavily on the size and value style factors. Meanwhile, effective industry models often attribute risk to sector and industry factors before style. Which approach is more effective? Though claims that style explains stock returns are common, they usually lack evidence – there is a paucity of research that compares the explanatory power of sectors and style.

This paper provides the missing data and analyzes the explanatory power of sectors and style for U.S. stocks. We find that, after controlling for the market exposure, sectors are slightly more effective than size and approximately four times more effective than value in explaining monthly returns.

Measuring the Explanatory Power of Sectors and Style for U.S. Stock Returns

  • We analyze the monthly returns of U.S. stocks over the 15-year interval 4/30/2004-4/30/2019. We start with approximately 6,000 stocks that pass minimum market capitalization and liquidity thresholds – a universe similar to the Russell 3000 Index.
  • For each month in the historical interval and each stock in the sample, we estimate the stock’s market beta.
  • We calculate (out-of-sample) market residuals (alphas) for a given month using the prior month’s market beta for each stock.
  • We restrict the analysis to stocks with at least five years of defined market residuals to have a significant sample. This final sample comprises approximately 3,600 stocks.
  • We construct sector, size, and value factors as follows:
    • The sector factors are cap-weighted portfolios of market residuals for the nine sectors equivalent to the top-level GICS sectors.
    • The size factor is long the cap-weighted portfolio of stocks in the top 25% and short stocks in the bottom 25%, as ranked by market capitalization.
    • The value factor is long the cap-weighted portfolio of stocks in the top 25% and short stocks in the bottom 25%, as ranked by Book/Price.

The results below are insensitive to the specific factor definition and hold across different sector and style portfolios.
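The factor construction described above can be sketched as follows. This is a minimal illustration rather than the study's implementation: the function names, the toy data, and the choice to cap-weight each leg of the long-short portfolio separately are assumptions.

```python
import numpy as np

def cap_weighted_return(returns, caps):
    """Cap-weighted average of one period's returns."""
    weights = caps / caps.sum()
    return float(weights @ returns)

def long_short_factor_return(returns, caps, rank_by, quantile=0.25):
    """Long the top quantile and short the bottom quantile of `rank_by`.

    Each leg is cap-weighted (an assumption about how the legs are built).
    """
    lo, hi = np.quantile(rank_by, [quantile, 1.0 - quantile])
    top = rank_by >= hi
    bottom = rank_by <= lo
    return (cap_weighted_return(returns[top], caps[top])
            - cap_weighted_return(returns[bottom], caps[bottom]))

# Toy data: one month of market residuals for eight hypothetical stocks.
residuals = np.array([0.02, -0.01, 0.03, 0.00, -0.02, 0.01, 0.04, -0.03])
caps = np.array([500.0, 300.0, 200.0, 100.0, 50.0, 20.0, 10.0, 5.0])
book_to_price = np.array([0.2, 0.9, 0.4, 1.1, 0.3, 0.8, 0.5, 1.5])

size_factor = long_short_factor_return(residuals, caps, rank_by=caps)
value_factor = long_short_factor_return(residuals, caps, rank_by=book_to_price)
```

A sector factor for one month would be `cap_weighted_return` applied to the residuals and capitalizations of the stocks in that sector.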

  • For sector, size, and value factors, we regress stocks’ market residuals on the corresponding factor returns and measure each regression’s R². This measures the theoretical (in-sample) explanatory power of sectors and style.
  • We estimate factor exposures by robust regression. We fit models with iteratively reweighted least squares (IRLS). Observations are exponentially weighted with a half-life of approximately 36 months.
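The regression step above could look roughly like the sketch below. The text specifies robust regression, IRLS, and a 36-month half-life, but not the loss function, so the Huber weights, the tuning constant, and all names here are assumptions.

```python
import numpy as np

def exp_weights(n_obs, half_life=36.0):
    """Weights that halve every `half_life` periods; newest observation last."""
    age = np.arange(n_obs)[::-1]          # 0 for the most recent month
    return 0.5 ** (age / half_life)

def robust_weighted_fit(x, y, half_life=36.0, huber_c=1.345, n_iter=25):
    """IRLS with Huber weights on top of exponential time weights.

    Returns (intercept, slope, weighted R^2).
    """
    X = np.column_stack([np.ones_like(x), x])
    time_w = exp_weights(len(y), half_life)
    robust_w = np.ones_like(y)
    for _ in range(n_iter):
        w = time_w * robust_w
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        resid = y - X @ coef
        # Robust scale estimate: median absolute deviation of the residuals.
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        if scale == 0:
            break
        u = np.abs(resid) / (huber_c * scale)
        robust_w = 1.0 / np.maximum(u, 1.0)   # down-weight large residuals
    # R^2 uses the same weights as the final fit.
    y_bar = np.average(y, weights=w)
    ss_res = np.sum(w * (y - X @ coef) ** 2)
    ss_tot = np.sum(w * (y - y_bar) ** 2)
    return float(coef[0]), float(coef[1]), float(1.0 - ss_res / ss_tot)

# Synthetic example: 120 months of a factor and one stock's market residuals.
rng = np.random.default_rng(0)
factor = rng.normal(0.0, 0.04, 120)
stock_residual = 0.8 * factor + rng.normal(0.0, 0.01, 120)
intercept, slope, r2 = robust_weighted_fit(factor, stock_residual)
```

The synthetic R² is far higher than the values reported in this study; the data only exercise the code.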

A Comparison of the Explanatory Power of Sectors and Style for U.S. Stock Returns

The chart below plots the distributions of R² from the regressions of U.S. stocks’ market residuals against the sector, size, and value factors. The x-axis plots the intervals of regression R², and the y-axis plots the number of stocks in each interval. Since the distributions approximately follow a power law, we use a log y-axis:

U.S. Stocks: The Distributions of R² for the Regressions of U.S. Stock Market Residuals (Alphas) on the Sector, Size, and Value Factors

The distributions above illustrate the higher explanatory power of sectors compared to style: Whereas sector factors explain over 25% of the variance in market residuals for hundreds of stocks, style factors do so for only a small handful.

The chart below plots the mean R² values:

U.S. Stocks: Mean R² for the Regressions of U.S. Stock Returns and Market Residuals on the Various Factors

Reasons for Sectors’ Higher Explanatory Power

This higher explanatory power of sectors is unsurprising, given that commentary on style performance usually relies on sector factors: “Value underperformed because oil price crashed, and oil producer stocks, which are cheap, suffered.” On the other hand, even the most ideologically pure believer in the primacy of style would not make the statement: “Oil price crashed because oil producer stocks are cheap and value has recently underperformed.” Whereas sector factors can generally explain the reasons for style factor returns, style factors cannot explain the reasons for sector factor returns.

Since style factors capture systematic risk less effectively, portfolio construction from style building blocks can lead to significant unintended exposures. Studies of common smart beta strategies do indeed find such risks and significant market timing. On the other hand, sector and industry exposures offer superior control of systematic risk and more effective building blocks for portfolio construction.

In the sections that follow, we share the statistics on the explanatory power of the various factors.

The Explanatory Power of Market for U.S. Stock Returns

The chart below plots the distribution of R² from the regressions of U.S. stocks’ returns against the Market factor. This step of the analysis allows us to control for market risk and to analyze the explanatory power specific to the other factors:

U.S. Stocks: The Distribution of R² for the Regressions of U.S. Stock Returns on the Market Factor
Min.    1st Qu.  Median    Mean    3rd Qu.    Max.
0.0000  0.1117   0.1770    0.1914  0.2563     0.6525

Market explains approximately 20% of the (in-sample) variance of stock returns. The tests below analyze the out-of-sample (investable) market residuals that this step produces.
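The out-of-sample market residuals can be illustrated with a short sketch. It uses a plain rolling OLS beta over a hypothetical 36-month window for simplicity; the study itself estimates betas by robust, exponentially weighted regression, so the estimator, the window, and the names below are assumptions.

```python
import numpy as np

def ols_beta(stock, market):
    """Slope of stock returns on market returns (with intercept)."""
    m = market - market.mean()
    return float(m @ (stock - stock.mean()) / (m @ m))

def out_of_sample_residuals(stock, market, window=36):
    """Residual for month t uses only the beta estimated through month t-1."""
    resid = np.full(len(stock), np.nan)
    for t in range(window, len(stock)):
        beta_prev = ols_beta(stock[t - window:t], market[t - window:t])
        resid[t] = stock[t] - beta_prev * market[t]
    return resid

# Synthetic check: a stock that is exactly 1.2x the market has zero residuals.
rng = np.random.default_rng(0)
market_returns = rng.normal(0.005, 0.04, 60)
stock_returns = 1.2 * market_returns
residuals = out_of_sample_residuals(stock_returns, market_returns)
```

Because month t's residual never uses month t's return in the beta estimate, the residual series is investable in the sense used above.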

The Explanatory Power of Sectors for U.S. Stock Returns

The chart below plots the distribution of R² from the regressions of U.S. stocks’ market residuals against sector factors:

U.S. Stocks: The Distribution of R² for the Regressions of U.S. Stock Market Residuals on Sector Factors
Min.    1st Qu.  Median    Mean    3rd Qu.    Max.
0.0000  0.0070   0.0267    0.0618  0.0767     0.6166

For half of the stocks, sectors explain 2.7% or more of return variance, after controlling for market risk. The mean R² of 6.2% is much higher than the median, since sectors explain a large fraction of return variance for some stocks (e.g., the Energy sector for Exxon Mobil).

The Explanatory Power of Size for U.S. Stock Returns

The following chart plots the distribution of R² from the regressions of U.S. stocks’ market residuals against the Size factor:

U.S. Stocks: The Distribution of R² for the Regressions of U.S. Stock Market Residuals on the Size Factor
Min.    1st Qu.  Median    Mean    3rd Qu.    Max.
0.0000  0.0089   0.0290    0.0415  0.0599     0.3366

These results support the popularity of the size factor in academic research. For half of the stocks, the size factor explains 2.9% or more of return variance, after controlling for market risk. Nevertheless, the mean explanatory power of sectors is approximately 1.5 times greater (mean R² of 0.0618 versus 0.0415).

The Explanatory Power of Value for U.S. Stock Returns

The following chart plots the distribution of R² from the regressions of U.S. stocks’ market residuals against the Value factor:

U.S. Stocks: The Distribution of R² for the Regressions of U.S. Stock Market Residuals on the Value Factor
Min.    1st Qu.  Median    Mean    3rd Qu.    Max.
0.0000  0.0020   0.0092    0.0202  0.0251     0.3350

Contrary to its vogue in academic research, the explanatory power of Value is low, even in these in-sample results. For the median stock, the Value factor explains less than 1% of return variance, after controlling for market risk. Even at the 75th percentile of stocks, it explains only about 2.5% of return variance.

Notes on the Quantitative Methodology

This study controls for market risk before analyzing the explanatory power of sector and style factors. This two-step approach is necessary to avoid the multicollinearity problems that plague academic research into style factors. Since small and large companies typically have different market betas, and since cheap and expensive companies also typically have different market betas, the Fama-French and Carhart factors are collinear. Though this multicollinearity does not necessarily undermine the overall model, it does render individual factor betas and associated statistics meaningless.
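The collinearity concern can be made concrete with a variance inflation factor (VIF). The sketch below uses synthetic data and a hypothetical raw size factor that inherits market exposure; it illustrates the general problem and one way to remove it, and is not a reproduction of the study's calculations.

```python
import numpy as np

def r_squared(y, X):
    """In-sample R^2 of an OLS fit of y on X with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    return 1.0 - resid.var() / y.var()

def vif(factors, j):
    """Variance inflation factor of column j given the other columns."""
    others = np.delete(factors, j, axis=1)
    return 1.0 / (1.0 - r_squared(factors[:, j], others))

rng = np.random.default_rng(1)
market = rng.normal(0.0, 0.04, 180)
# Hypothetical raw size factor that carries market exposure.
size_raw = -0.3 * market + rng.normal(0.0, 0.01, 180)
vif_raw = vif(np.column_stack([market, size_raw]), 1)

# Two-step alternative: strip market exposure from the factor first.
beta = np.cov(size_raw, market)[0, 1] / market.var(ddof=1)
size_resid = size_raw - beta * market
vif_resid = vif(np.column_stack([market, size_resid]), 1)
```

A VIF well above 1 means the factor's beta is estimated less precisely in a joint regression; after the market component is removed, the VIF falls to 1.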

We measured the in-sample explanatory power of the various factors, as is typical in academic research on the subject. These results are theoretical and do not represent practically attainable investment outcomes; they are an upper bound on out-of-sample explanatory power. The in-sample approach calculates factor exposures and residuals using a regression of stock returns on one or more factors over the full period. For instance, the regression of AAPL in the 4-factor Carhart model for 2010-2015 produces betas and alphas that are un-investable: to realize this alpha, one would need to know 2014 returns in order to effectively hedge AAPL in 2010. We use a similar approach in this study, and our analysis suffers from the same limitations – the results are in-sample.

Conclusions

  • Academic analysis favors factors with less explanatory power than industry’s real-world modeling.
  • The explanatory power of sectors is slightly higher than that of size, and approximately four times greater than that of value/growth.
  • Portfolio construction and manager allocation with sector, rather than style, building blocks provide greater control over systematic risk.
  • Risk models that seek to capture systematic risk effectively should account for sector or industry risk before style risk.
  • Sectors’ higher explanatory power holds across different industry classifications and style factor definitions.

The information herein is not represented or warranted to be accurate, correct, complete or timely. Past performance is no guarantee of future results. Copyright © 2012-2019, AlphaBetaWorks, a division of Alpha Beta Analytics, LLC. All rights reserved. Content may not be republished without express written consent.

The Predictive Power of Active Share

Active Share is a popular metric that purports to measure portfolio activity. Though Active Share’s fragility and ease of manipulation are increasingly well-understood, there has been no research on its predictive power.

This paper quantifies the predictive power of Active Share and finds that, though Active Share is a statistically significant predictor of the performance difference between portfolio and benchmark, it is a weak one, explaining only approximately 5% of the variation in active management across U.S. equity mutual funds. The predictive power of Active Share is a small fraction of that achieved with robust and predictive equity risk models.

The Breakdown of Active Share

Active Share, the absolute percentage difference between portfolio and benchmark holdings, is a common metric of fund activity. The flaws of this measure are evident from simple examples:

  • If a fund with an S&P 500 benchmark buys SPXL (S&P 500 Bull 3x ETF), becoming more similar to the benchmark, its Active Share increases.
  • If a fund with an S&P 500 benchmark indexes the Russell 2000, this passive strategy has 100% Active Share.
  • If a fund F1 differs from the benchmark B in a single 5% position P1 with 20% residual (idiosyncratic, stock-specific) volatility, and a fund F2 differs from B in a single 10% position P2 with 5% residual volatility, then F2 has a higher Active Share, yet is less active.
  • If a fund holds a secondary listing of a benchmark holding, its Active Share increases.
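The examples above are easy to reproduce. The sketch below uses the standard definition of Active Share (half the sum of absolute weight differences); the tickers, weights, and the single-position approximation of active risk (position size times residual volatility) are illustrative assumptions.

```python
def active_share(portfolio, benchmark):
    """Half the sum of absolute weight differences across all holdings."""
    tickers = set(portfolio) | set(benchmark)
    return 0.5 * sum(abs(portfolio.get(t, 0.0) - benchmark.get(t, 0.0))
                     for t in tickers)

# Toy three-stock benchmark (hypothetical tickers and weights).
benchmark = {"AAA": 0.5, "BBB": 0.3, "CCC": 0.2}

# A fund that passively indexes a disjoint universe looks fully "active".
disjoint_index_fund = {"XXX": 0.6, "YYY": 0.4}

# Swapping a holding for its secondary listing raises Active Share
# even though the economic exposure is unchanged.
secondary_listing_fund = {"AAA": 0.5, "BBB": 0.3, "CCC.ALT": 0.2}

as_disjoint = active_share(disjoint_index_fund, benchmark)
as_secondary = active_share(secondary_listing_fund, benchmark)

# Active risk of a single active position: weight x residual volatility.
f1_active_risk = 0.05 * 0.20   # 5% position, 20% residual volatility
f2_active_risk = 0.10 * 0.05   # 10% position, 5% residual volatility
```

In the F1/F2 case, F2's active position is twice as large, yet its active risk under this approximation is half of F1's.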

In light of the above flaws, evidence that high Active Share funds that outperform may merely index higher-risk benchmarks is unsurprising.

Measuring Active Management

A common defense is that the above and similar examples are pathological or esoteric, unrepresentative of actual portfolios. Such a defense asserts that Active Share measures active management of real-world portfolios.

Astonishingly, we have not seen a single paper assessing whether Active Share has any effectiveness in doing what it is supposed to do – identify which funds are more and which are less active. This paper provides such an assessment.

We consider two metrics of fund activity: Tracking Error and monthly active returns (measured as Mean Absolute Difference between portfolio and benchmark returns). Both of these metrics measure how different the portfolios are in practice. Whether Active Share has value for measuring fund activity depends on whether it can differentiate among more and less active funds.
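Both activity metrics are simple functions of the monthly active return series. A minimal sketch with made-up returns follows; the function names are assumptions, and no annualization is applied because the text does not specify one.

```python
import numpy as np

def tracking_error(portfolio, benchmark):
    """Sample standard deviation of monthly active returns."""
    active = np.asarray(portfolio) - np.asarray(benchmark)
    return float(active.std(ddof=1))

def mean_absolute_active_return(portfolio, benchmark):
    """Mean absolute difference between portfolio and benchmark returns."""
    active = np.asarray(portfolio) - np.asarray(benchmark)
    return float(np.abs(active).mean())

# Four months of hypothetical returns.
benchmark_returns = [0.01, -0.02, 0.03, 0.00]
fund_returns = [0.02, -0.03, 0.03, 0.01]

te = tracking_error(fund_returns, benchmark_returns)
mad = mean_absolute_active_return(fund_returns, benchmark_returns)
```

A perfect index fund scores zero on both metrics, whatever its Active Share against some other benchmark.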

The study dataset comprises portfolio histories of approximately three thousand U.S. equity mutual funds that are analyzable from regulatory filings. The funds had 2-10 years of history. Our study uses the bootstrapping statistical technique – we select 10,000 samples and perform the following steps for each sample:

  • Select a random fund F and a random date D.
  • Calculate Active Share of F to the S&P 500 ETF (SPY) at D.
  • Keep samples with Active Share between 0 and 0.75, indicating that SPY may be an appropriate benchmark. This step excludes small- and mid-capitalization funds that share no holdings with SPY and would all collapse into a single point with an Active Share of 1 (100%), impairing statistical analysis.
  • Measure the activity of F for the following 12 months (period D to D + 12 months). We determine how active a fund is relative to a benchmark by quantifying how similarly to the benchmark it performs.

After the above steps, we have 10,000 observations of fund activity as estimated by Active Share and actual subsequent fund activity.
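The sampling procedure above can be sketched as follows. The data layout, the dictionary keys, and the choice to keep drawing until the target number of retained samples is reached are assumptions made for illustration; the fund data are made up.

```python
import random

def draw_samples(funds, n_samples, max_active_share=0.75, seed=0,
                 max_draws=1_000_000):
    """Draw (active_share, subsequent_activity) pairs with replacement.

    `funds` maps a fund id to a list of dated observations. Each observation
    is a dict with hypothetical keys "active_share" (versus SPY at date D)
    and "next_12m_activity" (e.g. tracking error over D to D + 12 months).
    """
    rng = random.Random(seed)
    fund_ids = sorted(funds)
    samples = []
    draws = 0
    while len(samples) < n_samples and draws < max_draws:
        draws += 1
        fund = rng.choice(fund_ids)              # random fund F
        obs = rng.choice(funds[fund])            # random date D
        if obs["active_share"] > max_active_share:
            continue                             # SPY is a poor benchmark here
        samples.append((obs["active_share"], obs["next_12m_activity"]))
    return samples

# Made-up observations for two funds.
toy_funds = {
    "F1": [{"active_share": 0.40, "next_12m_activity": 1.1},
           {"active_share": 0.90, "next_12m_activity": 4.0}],
    "F2": [{"active_share": 0.65, "next_12m_activity": 2.3},
           {"active_share": 0.20, "next_12m_activity": 0.6}],
}
samples = draw_samples(toy_funds, n_samples=100)
```

The resulting pairs are what a regression of subsequent activity on Active Share would be fit to.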

The Predictive Power of Active Share for Large-Cap U.S. Equity Mutual Funds

The following results quantify the predictive power of Active Share for differentiating between more and less active U.S. equity mutual funds. For perspective, we also include results on the predictive power of robust equity risk models. These results illustrate the relative weakness of Active Share as a measure of fund activity. They also indicate that, far from mitigating legal risk by reliance on a best practice, the use of Active Share to detect closet indexing may instead create legal risk.

The Predictive Power of Active Share for Forecasting Future Tracking Error

Active Share is a statistically significant metric of fund activity, but a very weak one, predicting only about 5% of the variation in tracking error across mutual funds:

U.S. Equity Mutual Fund Portfolios: The Predictive Power of Active Share for Forecasting Future Tracking Error

Residual standard error: 1.702 on 9998 degrees of freedom
Multiple R-squared:  0.05163,   Adjusted R-squared:  0.05154 
F-statistic: 544.3 on 1 and 9998 DF,  p-value: < 2.2e-16

The Predictive Power of Active Share for Forecasting Future Active Returns

Active Share also predicts only about 5% of the variation in monthly active returns across mutual funds:

U.S. Equity Mutual Fund Portfolios: The Predictive Power of Active Share for Forecasting Future Active Return

Residual standard error: 0.3986 on 9998 degrees of freedom
Multiple R-squared:  0.04999,   Adjusted R-squared:  0.04989
F-statistic: 526.1 on 1 and 9998 DF,  p-value: < 2.2e-16

The above results make the generous assumption that all relative returns are due to active management. In fact, much relative performance is attributable to passive differences between a portfolio and a benchmark. This complexity will be captured in our follow-up research.

The Predictive Power of Robust Equity Risk Models

To put the predictive power of Active Share into perspective, we compare it to the predictive power of tracking error as estimated by robust and predictive equity risk models. Instead of Active Share, we use our default Statistical U.S. Equity Risk Model to forecast tracking error of a fund F at D.

The Predictive Power of Equity Risk Models for Forecasting Future Tracking Error

The equity risk model predicts approximately 38% of the variation in tracking error across mutual funds:

U.S. Equity Mutual Fund Portfolios: The Predictive Power of Robust Equity Risk Models for Forecasting Future Tracking Error

Residual standard error: 1.379 on 9998 degrees of freedom
Multiple R-squared: 0.3776, Adjusted R-squared: 0.3776
F-statistic: 6067 on 1 and 9998 DF, p-value: < 2.2e-16

The Predictive Power of Equity Risk Models for Forecasting Future Active Returns

The equity risk model predicts approximately 44% of the variation in monthly active returns across mutual funds:

U.S. Equity Mutual Fund Portfolios: The Predictive Power of Robust Equity Risk Models for Forecasting Future Active Return

Residual standard error: 0.3068 on 9998 degrees of freedom
Multiple R-squared:  0.4375,    Adjusted R-squared:  0.4374
F-statistic:  7776 on 1 and 9998 DF,  p-value: < 2.2e-16
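As a consistency check on the four regression summaries above, the F-statistic of an OLS fit with an intercept follows from its R² and degrees of freedom: F = (R²/k) / ((1 − R²)/df). The sketch below recomputes it from the reported R² values; the labels and function name are ours, and small differences come from rounding in the reported R².

```python
def f_statistic(r_squared, residual_df, n_predictors=1):
    """F-statistic implied by R^2 for an OLS fit with an intercept."""
    return (r_squared / n_predictors) / ((1.0 - r_squared) / residual_df)

# (label, reported R^2, reported F) copied from the regression summaries above.
reported = [
    ("Active Share -> tracking error", 0.05163, 544.3),
    ("Active Share -> active return", 0.04999, 526.1),
    ("Risk model -> tracking error", 0.3776, 6067.0),
    ("Risk model -> active return", 0.4375, 7776.0),
]
implied = {label: f_statistic(r2, 9998) for label, r2, _ in reported}

# Ratio of risk-model R^2 to Active Share R^2 for each activity metric.
ratio_tracking_error = 0.3776 / 0.05163
ratio_active_return = 0.4375 / 0.04999
```

The two ratios are about 7.3 and 8.8.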

Conclusions

  • Active Share is a statistically significant metric of active management (there is a relationship between Active Share and how active a fund is relative to a given benchmark), yet the predictive power of Active Share is very weak.

  • Active Share predicts only about 5% of the variation in tracking error and active returns across U.S. equity mutual funds.

  • A robust and predictive equity risk model is approximately 7 to 9 times more effective than Active Share, predicting approximately 40% of the variation in tracking error and active returns across U.S. equity mutual funds.

  • In the following articles, we will put the above predictive statistics into context and quantify how likely Active Share is to identify closet indexers.

Equity Analytics

Passively-available beta differences with a benchmark are a byproduct, typically unintentional, of any stock-selection process. Since consistent passive differences, once properly identified, can be freely obtained or offset, they are not part of active contribution.

Isolating active performance from the impact of consistent passive differences offers tremendous oversight advantages.

Unfortunately, current methodologies all fail to properly define passive exposures. As a result, current analytics fail to predict future performance, and analytics are only valid to the extent they are predictive.

If risk analytics are valid, they’re predictive.  Analytics that fail to predict future performance are invalid.

We’re offering highly predictive* statistical risk models built to isolate active contributions from passively-available exposures. They reveal security-selection skill that persists, true active risk, and opportunities to reduce relative risk without sacrificing active risk or to offset any unintentional bets that may endanger performance.

* Over 0.96 median correlation between predicted and subsequent realized returns

FAQs

What’s wrong with FactSet’s attributions and risk estimates and how can you prove it?

Brinson attribution and Active Share fail to consider differences in individual securities’ market and sector betas. As a consequence, they fail to properly separate active from passive performance. The problem can be demonstrated by observing non-zero correlations between the security-selection and passive return components.

Security-selection return is defined as a residual (that portion of incremental return unexplained by various passive market exposures) so by definition it is uncorrelated with passive benchmarks or exposures. To the extent security-selection return and passive return calculated by a given system are found to be correlated, the system has failed to properly isolate active contribution and will fail to detect skill and active risk.

A simple example may be how Brinson attribution will deal with a leveraged passive ETF. If SPY returns +10% and a 2x leveraged SPY ETF returns +20%, the Brinson approach will attribute +10% to security selection, instead of the passive market effect.
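The arithmetic of this example, as a minimal sketch. The function names are ours; the "naive" version simply mirrors the attribution described above, which compares returns without a beta adjustment.

```python
def naive_selection_return(portfolio_return, benchmark_return):
    """Excess return with no beta adjustment, all attributed to selection."""
    return portfolio_return - benchmark_return

def beta_adjusted_selection_return(portfolio_return, benchmark_return, beta):
    """Residual after removing the passive market effect (beta x benchmark)."""
    return portfolio_return - beta * benchmark_return

spy_return = 0.10
leveraged_etf_return = 0.20

naive = naive_selection_return(leveraged_etf_return, spy_return)
residual = beta_adjusted_selection_return(leveraged_etf_return, spy_return, 2.0)
```

With a market beta of 2, the whole 20% return is passive market effect and the selection residual is zero, whereas the unadjusted comparison reports 10% of "selection".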

Returns-Based Style (RBS) analysis estimates average exposure over time and fails for active portfolios in which exposures change through time. The problem with RBS can be demonstrated by comparing current predictions with future performance.

In fact, risk and skill analytics are only valid to the extent they are predictive.  Skilled managers detected by Brinson attributions should tend to outperform in subsequent years, and risks estimated by RBS should reasonably accurately predict future return.

In our tests, we’ve found neither of FactSet’s approaches to be predictive.

 

Why is your approach better and how can you prove it?

Robust estimates of point-in-time betas overcome limitations of Brinson/RBS and result in predictive attributions and risk exposures. This is demonstrated both with the persistence of security selection skill and with the correlation of returns predicted by current exposures with future realized returns.

 

If RBS fails to capture changing exposures, why not just look at rolling regressions?

Regressions provide an estimate of average exposure over a period rather than exposure as of a point-in-time. To the extent exposures change significantly during the period (as with active portfolios) averages may be a poor approximation for any given point-in-time exposure. Both attributions and portfolio risk estimates require point-in-time exposures.

Rolling regressions would simply increase the amount of bad data. For example, consider a manager who ran a 1.5x market beta portfolio last month but got worried this month and switched to a 0.5x market beta portfolio. A rolling regression over both months will report an exposure somewhere between the two, which describes neither month.
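The manager example can be reproduced with a deterministic toy series. The daily market returns below are made up, and the portfolio has no stock-specific noise, so the only thing the regression can get wrong is the change in beta.

```python
import numpy as np

# Two hypothetical 21-day months with the same market return pattern.
market = np.tile([0.01, -0.01, 0.0], 14)
true_beta = np.r_[np.full(21, 1.5), np.full(21, 0.5)]
portfolio = true_beta * market                   # no stock-specific noise

# OLS slope over the whole two-month window.
m = market - market.mean()
window_beta = float(m @ (portfolio - portfolio.mean()) / (m @ m))
current_beta = float(true_beta[-1])
```

Here the regression reports a beta of 1.0 over the window, while the portfolio's point-in-time beta at the end of the window is 0.5.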

 

Are your tests time period dependent?  Why should we be confident in your data?

You don’t need to have any confidence in our data. By testing a replicating passive portfolio for a manager, you can see the predictive effectiveness of the models for yourself.

In general, you are right to be skeptical and should only trust risk models and performance analytics that you can test out-of-sample yourself, such as with the replicating portfolio tests that we advise.

 

Why can’t we do the same thing you’re doing ourselves?

Both the Brinson and the RBS analysis in FactSet can be readily replicated in Excel.

Statistical equity risk models, on the other hand, are mathematically complex and require properly translating individual security exposures to portfolio exposures and variances.

Some analytics define risk as of a point in time as the volatility of a portfolio given its holdings and their recent returns until that point in time. Is this estimate accurate?

This is a reasonable approximation of current risk and, by extension, VaR. However, it does not identify any of the market factors that contribute to current risk, and knowledge of underlying exposures is critical.

Three main reasons:

  • If you don’t know what the sources of risk are, you don’t know what can be done to make changes and mitigate any problems. For example, if two portfolios have +10% and -10% statistical exposure to Emerging Markets, the tracking error and VaR will be the same, but the actual risks (and remedies) are the opposite of each other. It’s not terribly helpful to know what current risk is if you don’t know what measures can be taken if that risk is too high or too low.

  • Current risk, without knowledge of market exposures, cannot be used for stress testing over different market regimes or historical periods.

  • Portfolios with equal risk defined by recent history may have very different underlying exposures, which coincidentally have had the same recent volatility, and those exposures may have completely different long-term risk profiles which would remain hidden with the more simplistic approach. Hidden exposures to market factors that have had uncharacteristically low recent volatility may cause true current risk to be seriously underestimated. Modeling tail risk based on standard deviations, without quantifying sources of volatility, risks missing the forest for the trees.
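The first point can be illustrated with a single factor. The volatility and stress figures are hypothetical, and the functions reduce the risk model to one exposure for clarity.

```python
def factor_tracking_error(exposure, factor_volatility):
    """Tracking error from one factor exposure: |exposure| x factor volatility."""
    return abs(exposure) * factor_volatility

def stress_return(exposure, factor_shock):
    """Relative return under a factor shock; the sign of the exposure matters."""
    return exposure * factor_shock

em_volatility = 0.20   # hypothetical annual volatility of the Emerging Markets factor
em_shock = -0.30       # hypothetical stress scenario

te_long = factor_tracking_error(+0.10, em_volatility)
te_short = factor_tracking_error(-0.10, em_volatility)
stress_long = stress_return(+0.10, em_shock)
stress_short = stress_return(-0.10, em_shock)
```

Both portfolios show the same tracking error, but the stress scenario hurts one and helps the other, which a volatility-only risk number cannot reveal.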

 

Do you run your factor-oriented model on holdings, observed historical returns, or both?

Our factor models are built for individual stocks by analyzing observed historical returns. The regression of stock returns on factors calculates stocks’ factor exposures.

These individual stocks’ factor exposures are then aggregated for a portfolio using holdings data to estimate portfolio factor exposures over time.

In summary, the analysis uses both holdings and returns: returns of individual stocks to estimate the factor exposures of stocks and portfolio holdings of individual stocks to estimate the factor exposures of portfolios.
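A minimal sketch of the aggregation step: portfolio factor exposures are the holdings-weighted sum of stock-level factor exposures. The betas, weights, and factor names below are hypothetical.

```python
import numpy as np

factors = ["Market", "Technology", "Energy"]
# Rows: stocks; columns: factor betas from stock-level return regressions.
stock_betas = np.array([
    [1.0, 2.0, 0.0],
    [0.9, 0.0, 1.5],
    [1.2, 0.5, 0.0],
])
weights = np.array([0.5, 0.3, 0.2])   # portfolio weights from holdings data

portfolio_betas = weights @ stock_betas
exposures = dict(zip(factors, portfolio_betas))
```

Repeating this at each holdings date gives the portfolio's factor exposures over time.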

 

Your white paper states that your approach has 0.96 median correlation between predicted ex-ante and reported ex-post portfolio returns. This, I guess, is total return, and not excess return. Also, I assume that you need to know the risk factor realization for the future, and it’s not a pure “prediction”, right? Still, of course, very impressive correlation.

Yes, 0.96 is the median correlation between predicted ex-ante and reported ex-post total return. The median correlation between predicted and actual excess return is 0.66.

Another way to say this is that 0.96 is the correlation between the replicating passive factor portfolio constructed using the model and the subsequent actual portfolio returns. This replication factor portfolio does not imply any knowledge of the future factor return realizations.

 

The paper shows Apple’s 2.3 sector beta… This surprises me: over what period, and relative to what sector index?

This is beta to the technology sector index, as of 12/30/17, and is based on returns over the previous three years, with a decay factor.

Note that we analyze sector exposure separately from the Market exposure. Your intuition that AAPL has a lower overall risk is correct — its market exposure is ~1. So AAPL has ~1 Market beta and ~2 Technology beta after controlling for Market risk. The ability to measure Market and Technology exposures of AAPL independently, and not assuming that they are equal, is a critical edge of our and other statistical factor models.

Our technology factor is the cap-weighted index of all U.S. Technology stocks. It is materially identical to the Russell 3000 Technology Index. In practice, the Technology Select Sector ETF (XLK) is also a good proxy. We can share a simplified model illustrating this relationship using AAPL, SPY, and XLK, if it would be helpful.

 

Given your example of Apple stock having a 2.2 beta to the sector, how persistent or how long can you reasonably forecast that stock keeping that beta?

The change in betas over time differs across companies. In the case of AAPL, the following is its Sector Exposure (Beta) over time:

The beta changes over several regimes, but remains stable for some time within each regime. It’s interesting to note the change in tech beta in ‘07 when the iPhone was introduced and transformed the company from an idiosyncratic niche player to the driver of the industry’s profits.

What’s equally important, we know that these estimates are unbiased predictors of the subsequent realized betas. So, even as the betas change over time, our estimate at a given point neither over- nor under-estimates Market and Sector betas.

 

What are the criteria for the 0.96 correlation? For example, is it more binary in nature: either the actual hit the exact predicted number or it didn’t? Or is there still some issue with P-hacking, such as having large confidence intervals that make the actual more likely to fall within the predicted range?

This is a Pearson’s correlation (https://en.wikipedia.org/wiki/Pearson_correlation_coefficient) of predicted and actual returns. There is no funny business.

Put differently, our past factor exposures explain about 92% of the variance of subsequent monthly returns (0.96² ≈ 0.92).
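For a Pearson correlation, the share of variance explained by a linear fit is the square of the correlation. The sketch below applies this to the 0.96 and 0.66 correlations quoted above and cross-checks the identity on a few made-up return pairs; the names and toy data are ours.

```python
import numpy as np

def variance_explained(correlation):
    """Share of variance explained by a one-variable linear fit."""
    return correlation ** 2

total_return_share = variance_explained(0.96)
excess_return_share = variance_explained(0.66)

# Cross-check on toy data: R^2 of a one-variable OLS fit equals r squared.
predicted = np.array([0.01, -0.02, 0.03, 0.00, 0.02])
actual = np.array([0.012, -0.015, 0.025, 0.002, 0.018])
r = np.corrcoef(predicted, actual)[0, 1]
slope, intercept = np.polyfit(predicted, actual, 1)
fitted = intercept + slope * predicted
r2 = 1.0 - np.sum((actual - fitted) ** 2) / np.sum((actual - actual.mean()) ** 2)
```

A correlation of 0.96 corresponds to roughly 92% of variance, and 0.66 to roughly 44%.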

While it is always wise to be concerned with P-hacking, the above results have held up out-of-sample over several years, thousands of funds, for new institutions, and in new markets.

That said, the best way to prove effectiveness is to take a few sample portfolios, have us replicate them, and check for yourself how well these predictions hold up in the future as well as how they compare to the predictions of other analytics vendors and consultants. The only way to prove the effectiveness of a system or to compare it to that of other systems or processes is to benchmark all out-of-sample and compare the effectiveness of predictions.

Can you explain how the passive ETF replicating portfolio is constructed? Is there a static allocation to a group of various ETFs over time, or do the allocations and types of ETFs used get rebalanced over time?

The particular ETFs used as factors in our risk models are constant (market, sector, style, and bonds for the US model) and all are available passively, which is key. The passive component of incremental return is based on the average exposure (beta) over time (ten years assuming sufficient holding data) to each factor. The timing component is that due to variation in factor exposure, and security-selection is the residual relative to the return calculated by the model. The difference between the model’s calculated return and the portfolio’s actual reported return is also shown as trading/unexplained.
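The decomposition described above can be sketched for a single factor. This is an illustration under simplifying assumptions: one factor, point-in-time betas known at the start of each period, made-up numbers, and no separate trading/unexplained term.

```python
import numpy as np

def decompose(portfolio_returns, betas, factor_returns):
    """Split returns into passive, timing, and selection components."""
    average_beta = betas.mean()
    passive = average_beta * factor_returns               # average exposure
    timing = (betas - average_beta) * factor_returns      # exposure variation
    selection = portfolio_returns - betas * factor_returns  # residual
    return passive, timing, selection

# Three hypothetical periods.
betas = np.array([1.5, 0.5, 1.0])
factor_returns = np.array([0.02, -0.01, 0.03])
portfolio_returns = np.array([0.035, -0.004, 0.028])

passive, timing, selection = decompose(portfolio_returns, betas, factor_returns)
```

By construction the three components add back up to the portfolio return in every period.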

Your question goes to the core definition of active return from stock picking vs. active return from stock picking and factor timing.  You can do either:

1) If you construct a single replicating ETF portfolio and never rebalance it, then the performance of a fund relative to this portfolio would be due to both factor/market timing returns (returns due to variation in systematic risk) and alpha/residual/stock picking returns (idiosyncratic returns unattributable to systematic risk).

2) If you construct/rebalance replicating ETF portfolios periodically to capture variable systematic risk over time, then the performance of a fund relative to this portfolio would be due to alpha/residual/stock picking returns (idiosyncratic returns unattributable to systematic risk).

Over a short period, such as a few months or a few years for low-turnover managers, factor timing returns are immaterial. The second approach, which we take, isolates stock picking from timing. Stock picking also ends up being a larger and more persistent source of active returns for most managers.

A few weeks to a few months of tracking a portfolio against a static replicating ETF portfolio without any rebalancing should be sufficient in most cases to validate the predictive value of our models.