## Asset Liability Modeling DFA/ALM

Peer Analytics’ consultants have been assisting clients in the asset allocation process for over thirty years. Together we have conducted over 250 asset/liability studies and developed asset allocation models for virtually every type of investment organization.

Michael Kantor and Dave Newsom developed the first stochastic asset allocation model for insurers in 1989.

We provide both strategic asset allocation consulting and license our cloud-based, user-friendly, transparent DFA/ALM models.

Dynamic Financial Analysis (DFA), or Asset Liability Modeling (ALM), is a stochastic simulation methodology that quantifies multiple forms of risk by simulating company financial results thousands of times for each asset mix, so that all potential outcomes associated with individual asset mixes can be considered in advance. The analysis embodies a complete insurance company financial model and considers all of the interrelationships between the asset and liability sides of the business. The model provides both an expected value and a distribution of possible values for each of the parameters evaluated. The approach allows clients to evaluate the relationships among the multiple dimensions of risk: asset risk, underwriting risk, reinsurance risk and business risk.

The approach takes into account all of the variables which affect the financial results of the company. By simulating investment, underwriting and premium growth results we are able to assign probabilities and provide ranges of potential outcomes given changes in variables. This provides decision makers with an analysis that evaluates the interrelationship among all the various risks that the company faces, rather than simply considering investment risk in isolation.
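As a minimal sketch of this simulation approach, each candidate asset mix can be pushed through thousands of multi-year scenarios to produce a distribution of ending surplus. All return, volatility, and underwriting numbers below are illustrative placeholders, not actual capital market assumptions:

```python
import random
import statistics

def simulate_surplus(asset_mix, n_scenarios=10_000, horizon_years=5, seed=7):
    """Toy DFA sketch: distribution of ending surplus for one asset mix.

    asset_mix is (stock_weight, bond_weight). The return, volatility,
    and underwriting-margin figures below are illustrative placeholders.
    """
    rng = random.Random(seed)
    stock_w, bond_w = asset_mix
    outcomes = []
    for _ in range(n_scenarios):
        surplus = 100.0  # starting surplus, arbitrary units
        for _ in range(horizon_years):
            stock_r = rng.gauss(0.07, 0.16)      # assumed stock return / vol
            bond_r = rng.gauss(0.04, 0.05)       # assumed bond return / vol
            uw_margin = rng.gauss(-0.01, 0.04)   # assumed underwriting result
            surplus *= 1 + stock_w * stock_r + bond_w * bond_r + uw_margin
        outcomes.append(surplus)
    outcomes.sort()
    return {
        "expected": statistics.mean(outcomes),
        "p5": outcomes[int(0.05 * n_scenarios)],   # downside tail
        "p95": outcomes[int(0.95 * n_scenarios)],  # upside tail
    }

for mix in [(0.1, 0.9), (0.3, 0.7)]:
    print(mix, simulate_surplus(mix))
```

Running every candidate mix through the same scenario engine yields, for each mix, both an expected value and a full distribution of outcomes, which is the basis for the risk/reward comparisons described here.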

The analysis offers an integrated perspective of risks, rather than the classic financial analysis in which different aspects of the company were considered in isolation from each other. Specifically, DFA / ALM models the reactions of the company in response to a large number of interrelated risk factors, including both underwriting risks – detailed by lines of business – as well as asset risks.

The model simulates inflation, interest rates and the shape of the Treasury yield curve, credit spread levels and their corresponding volatility, stock and bond market returns, the impact of reinsurance, costs to settle liability claims and premium growth rates.

Interest rates and yield-curve shape changes are modeled using the Cox-Ingersoll-Ross (CIR) methodology, which enforces mean reversion and keeps simulated rates non-negative. Yields and credit spreads, both absolute levels and changes, drive bond returns, which in turn impact the various equity class returns.
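A minimal Euler discretization of the CIR short-rate process, with illustrative (uncalibrated) parameters, might look like:

```python
import math
import random

def simulate_cir(r0=0.03, a=0.2, b=0.04, sigma=0.08,
                 years=10, steps_per_year=12, seed=42):
    """Euler discretization of the CIR short-rate process
        dr = a*(b - r)*dt + sigma*sqrt(r)*dW.
    The drift pulls r back toward the long-run mean b, and the sqrt(r)
    diffusion (with truncation at zero in the discretized path) keeps
    simulated rates non-negative. Parameter values are illustrative.
    """
    rng = random.Random(seed)
    dt = 1.0 / steps_per_year
    r, path = r0, [r0]
    for _ in range(years * steps_per_year):
        dw = rng.gauss(0.0, math.sqrt(dt))
        r += a * (b - r) * dt + sigma * math.sqrt(max(r, 0.0)) * dw
        r = max(r, 0.0)  # truncate: never let the discretized rate go negative
        path.append(r)
    return path

path = simulate_cir()
print(f"final rate {path[-1]:.4f}, range [{min(path):.4f}, {max(path):.4f}]")
```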

These fundamental asset return drivers are combined with corresponding cash flow patterns and simulated thousands of times to develop net cash flow patterns for the company. All of this is done within a framework facilitating the generation of financial statements over the specified pro forma period.

The following liability assumptions can be made using statutory Schedule P as a starting point and adjusted/changed as desired.

Premiums are defined as the product of Rate Change, Exposure Change, and Trend, each drawn from a separate uniform distribution. The Trend factor provides for a worst-case possibility that rate changes develop a downward trend. Users can define a Percent of Worst assumption as well as a distribution minimum and maximum for Trend.

Losses are defined as the product of a Severity draw (lognormal) and a Frequency draw (Poisson). Loss frequency rates may include a user-assumed potential tendency to Spiral.

LAE are defined as the product of a Severity draw (lognormal) and a Frequency draw (Poisson). LAE frequency rates may include a potential trend.

ULAE increase with the Inflation draw plus any user-defined adjustment.
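The premium and loss draws described above can be sketched as follows; all distribution parameters are hypothetical placeholders:

```python
import math
import random

rng = random.Random(1)

def premium_growth(rate=(-0.05, 0.05), expo=(-0.02, 0.03), trend=(-0.03, 0.0)):
    """Premium growth factor: product of uniform Rate Change, Exposure
    Change, and Trend draws. All bounds are hypothetical."""
    draw = lambda lo_hi: 1.0 + rng.uniform(*lo_hi)
    return draw(rate) * draw(expo) * draw(trend)

def poisson_draw(lam):
    """Poisson draw via Knuth's method (fine for small claim counts)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def aggregate_loss(freq_lambda=8.0, sev_mu=10.0, sev_sigma=1.2):
    """Aggregate loss: lognormal Severity summed over a Poisson claim
    count. mu/sigma are on the log scale and purely illustrative."""
    n = poisson_draw(freq_lambda)
    return sum(rng.lognormvariate(sev_mu, sev_sigma) for _ in range(n))

print(f"premium growth factor: {premium_growth():.4f}")
print(f"aggregate loss draw: {aggregate_loss():,.0f}")
```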

The model reflects the full financial structure of the sample company, including the impact of regulatory and tax structures, which allows projections to be made for the balance sheet and for the profit-and-loss account of the client company.

Capital market assumptions (risk, return, yield, duration and correlation) for individual asset classes and for inflation are made by Peer Analytics based on a combination of Capital Market Theory, historical relationships across asset classes, and current market conditions. Our capital market assumptions are intended to be tactically neutral and conservative. We typically model one or two alternative sets of capital market assumptions to depict the sensitivity of results to assumptions. These may be supplied by the client, third party advisors, or by Peer Analytics.

Liability assumptions for premium growth, loss ratios, reserve payout patterns and reinsurance strategy by line of business are made either by Peer Analytics based on review of historical statutory data or in conjunction with the client actuary.

Risk/reward relationships for a range of potential asset mixes are defined, beginning with the client’s current asset mix and extending to alternative, more diversified asset mixes.

Finally, we include a DFA analysis of client peer companies (DFA Peer Risk analysis) to describe the client’s asset, liability and income/balance-sheet risk relative to the same risk positions of individual peer companies.

Clients gain an understanding of the impact of current risk postures on potential future operating results, as well as how a change to investment strategy will impact the range of future surplus, net income and risk-based-capital levels. Clients will be able to communicate a clear and objective rationale for any changes in investment policy.

## Isolating Active Risk and Skill

## A New Approach to Oversight

*Properly isolating active contribution from passively-available exposures reveals true active risk, persistent security-selection skill, and unintentional risk exposures or gaps.*


**Concept**

Equity risk models are designed to measure risk as completely as possible for even the most concentrated portfolios. Popular models today use over 100 underlying risk factors, most of which are not investable. While effective at measuring risk, these models can be complex, expensive, and difficult to interpret.

They are also ill-suited for attribution: knowing that you underperformed because you were over-exposed to momentum or leverage at the wrong time is not terribly meaningful, and certainly not actionable.

On the other hand, a model designed specifically for oversight, with a limited number of passive ETF risk factors, measures risk just as well for most portfolios, but can also distinguish active contribution (security selection, timing, and trading) from the impact of passive exposure differences from the benchmark.

Consistent differences from the benchmark in passively-available exposures (market, sector, size, value, and bonds for a U.S. model) explain two-thirds of incremental return for the median portfolio.

*These passive differences from the benchmark, whether or not intended, are not part of an active manager’s contribution. They can be freely obtained if intentional, or offset in advance with ETFs or through a multi-manager structure if unintentional.*

**Value**

- When properly measured, skill persists. Negative skill strongly persists. Managers with top decile stock-selection skill are twice as likely to outperform in subsequent years. Managers in the bottom decile are more than twice as likely to underperform.

- One-third of active equity funds are closet indexers taking too little real active risk to offset their fees, even with skill. These managers can only be identified with a properly defined risk model.

- Passive exposure differences from the benchmark explain the majority of incremental return for most active portfolios. Unintended exposures can be freely offset.

**Proof**

Equity risk models can be mathematically complex and hard to compare. Fortunately, these models are easily tested.

To evaluate the accuracy of an equity risk model, we compare returns predicted by past factor exposures to subsequent portfolio performance: We measure factor exposures using end-of-month holdings and predict the following month’s return.

The correlation between predicted and actual return measures a model’s accuracy. The higher the correlation, the more effective a model is at hedging, stress testing and scenario analysis, as well as evaluating investment *risk* and *skill*.
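As a minimal sketch of this accuracy test, with made-up predicted and realized return series, the measure is just the Pearson correlation:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between predicted and realized returns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# hypothetical monthly returns: predicted from prior month-end factor
# exposures vs. subsequently realized portfolio returns
predicted = [0.021, -0.013, 0.008, 0.034, -0.006, 0.012]
realized  = [0.025, -0.010, 0.004, 0.030, -0.009, 0.015]
print(f"model accuracy (correlation): {pearson(predicted, realized):.3f}")
```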

It’s not necessary to rely on the out-of-sample testing we’ve published; we’re happy to provide passive ETF replicating portfolios for any of your managers so you can evaluate our models’ accuracy against future realized returns.


## Why Are Investment Performance Reviews So Boring?

**Why?**

*Three steps to meaningful, actionable performance reviews*

We’ve seen hundreds of investment performance reviews over the years. Most were a waste of time.

The typical investment performance review to an insurance company board is a perfunctory, boring recap of statistics; everyone politely waits until it’s time for something interesting — like the investment manager’s favorite stock picks or market forecasts.

Consider the equity manager assessment of a recent performance review: The manager underperformed the S&P 500 Index by 3.3 percent in the latest year, outperformed by 0.3 percent annualized over the three-year period, and underperformed by 1.4 percent annualized over the latest five-year period, but with lower volatility than the index. What conclusions can be drawn from this type of performance recap? None. ^{1}

So why review investment performance in the first place?

*The primary function of prudent oversight — and the first responsibility of every board member — is to continually ask the question: “Is everything still OK?”*

Effective performance evaluation is the only means to answer that most critical oversight question. An effective performance review is actionable. It provides the necessary data and context in sufficient detail to determine either 1) everything is still OK, or 2) there is a problem – and we’ve identified it.

**Problem: Insufficient Context**

Every investment strategy, portfolio and investment manager is subject to randomness and will underperform the relevant benchmark index during some time periods. The question then will be: “Is this underperformance an indication of a problem, or is it simply randomness?” If that question cannot be answered, the review was meaningless.

The unspoken difficulty in evaluating investment performance is that the distribution of randomness is significantly greater than the distribution of investment skill. The signal is small, the noise is large. The result is that it is very difficult to distinguish skill – positive or negative – from randomness. In fact, when evaluating an equity manager’s performance relative to a market index it takes between 40 and 70 years until there is sufficient data to infer a statistically meaningful conclusion regarding manager skill (see, for example: Rosenberg, Kritzman or Beebower ^{2}). That is likely too late.
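The arithmetic behind that 40-to-70-year figure is a simple t-test back-of-envelope: with a true annual alpha and a tracking error (noise), the t-statistic after n years of annual data is alpha·sqrt(n)/tracking_error, so significance requires n ≥ (t·tracking_error/alpha)². A sketch with hypothetical inputs:

```python
def years_to_significance(annual_alpha, tracking_error, t_crit=1.96):
    """Years of annual return data needed before true skill of size
    annual_alpha is statistically distinguishable (at ~95% confidence)
    from noise of size tracking_error:
        t = annual_alpha * sqrt(n) / tracking_error >= t_crit
        =>  n >= (t_crit * tracking_error / annual_alpha) ** 2
    """
    return (t_crit * tracking_error / annual_alpha) ** 2

# e.g. a manager with 1% true annual alpha and 4% tracking error
print(f"{years_to_significance(0.01, 0.04):.0f} years")  # prints 61
```

With these illustrative inputs the answer lands squarely in the 40-to-70-year range cited above.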

How much would your manager need to underperform the index to cause you to make a change?

Because the typical insurance company review compares performance solely to market indexes, results are neither meaningful nor actionable. Board members are bored for a good reason.

**Step 1: Additional Data**

Peer Universes – distributions of returns of “like-type” portfolios – respond to the lack of data context and help distinguish between a potential problem and normal randomness. For example, consider an equity manager whose annual return is six percentage points below his benchmark index.

In some years six percentage points below the market index equates to the bottom six percent of all similar equity managers, but in other years that same performance relative to the index is above the 35th percentile. Whether a manager is in the bottom 50 percent or the bottom five percent of other equity managers suggests very different conclusions. Without the added context of peer universes, the distinction is lost.
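Computing where a manager falls within a peer universe is straightforward; a sketch with hypothetical peer data:

```python
def percentile_rank(value, peer_values):
    """Percent of peers performing at or below `value`."""
    at_or_below = sum(1 for v in peer_values if v <= value)
    return 100.0 * at_or_below / len(peer_values)

# hypothetical peer returns relative to the index, in percentage points
peers = [-9, -7, -6, -4, -2, -1, 0, 1, 2, 4]
print(percentile_rank(-6, peers))  # -6 vs. the index ranks at 30.0 here
```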

**Survivor Bias**

While peer universe data is a critical component of a useful performance review, it is also important that peer universes are composed of truly “like-type” portfolios. Property-casualty company investment returns should be compared to those of other property-casualty company portfolios and not to distributions of mutual fund or pension fund returns. Pension funds are not subject to taxes, and both pension and mutual funds are subject to very different constraints than insurance companies. Neither represents an apples-to-apples comparison.

But a more significant problem is survivor bias. Investment management firms often discontinue funds with poor performance. The returns of these funds cannot be included in multiyear return universes, resulting in an upward bias of return distributions over time. Numerous studies of the impact of survivor bias on peer return distributions show a bias in median returns of two to five percentage points over three- and five-year time periods. Survivor bias can be significant enough to eliminate the value of peer universe comparisons for multi-year periods.

A distinct advantage of property-casualty, health, or life peer universes is that insurance companies are not discontinued due to poor relative investment performance; as a result, return distributions calculated at the insurer level do not suffer from survivor bias. Insurance companies are in an ideal position to profit from peer universes in their performance reviews.

**Step 2: A Question of Style**

When performance is too low (or too high) relative to the peer universe, the next level of analysis is required.

*A deeper analysis of performance is required only for periods of extreme relative performance.*

A manager in the bottom decile in a particular time period may or may not be a problem, and if there is a problem it may lie with the manager or with the understanding between the manager and the board. While there is no magic cutoff, in general, performance in the bottom ten percent (or in the extreme top of the distribution, by the way ^{3}) suggests more investigation is necessary.

Performance evaluation should be done with a hierarchical approach. If, in all time periods reviewed, the manager is within some range of the distribution of peer returns – say above the tenth percentile and below the 95th – deeper analysis is less critical. But if performance is below (or above) a certain threshold in a particular time period, the next level of investigation should be undertaken.

Equity managers, for example, tend to focus on different sectors of the overall equity market — large-capitalization stocks, small-cap stocks, growth stocks, value stocks, core (the entire equity market), sector rotator (i.e., a core manager with extra alpha risk) and international stocks (more a separate asset class than a distinct style). Poor relative performance may be a reflection of an equity style that is “out of favor” rather than a problem manager. The next step in the review process is to define the manager’s style and then evaluate his performance within a distribution of returns of managers with the same style. Equity manager style is defined quantitatively either by regressing the manager’s returns over time against market style indexes or, for a fixed point in time, by analyzing the style of individual portfolio holdings.
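A minimal unconstrained sketch of the regression approach, using two hypothetical style indexes (production implementations, such as Sharpe’s style analysis, typically constrain weights to be non-negative and sum to one):

```python
def style_weights(manager, growth, value):
    """Unconstrained two-index returns-based style regression (OLS
    without intercept), solved via the 2x2 normal equations. A real
    implementation would typically add Sharpe-style constraints
    (non-negative weights summing to one)."""
    sgg = sum(g * g for g in growth)
    svv = sum(v * v for v in value)
    sgv = sum(g * v for g, v in zip(growth, value))
    sgm = sum(g * m for g, m in zip(growth, manager))
    svm = sum(v * m for v, m in zip(value, manager))
    det = sgg * svv - sgv * sgv
    return ((sgm * svv - svm * sgv) / det,  # weight on growth index
            (svm * sgg - sgm * sgv) / det)  # weight on value index

# hypothetical monthly index returns; the manager is built to behave
# like 70% growth / 30% value, which the regression should recover
growth = [0.03, -0.02, 0.05, 0.01, -0.04, 0.02]
value  = [0.01,  0.00, 0.02, 0.03, -0.01, 0.01]
manager = [0.7 * g + 0.3 * v for g, v in zip(growth, value)]
print(style_weights(manager, growth, value))
```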

A manager in the bottom ten percent of all equity managers when his particular style is out-of-favor may be well above median when considered within a distribution of managers with the same equity style. A manager performing well within his style universe is clearly not a problem unless the manager’s style is other than expected, or has changed substantially over time, in which case there may be a manager problem or a communication problem.

A manager in the bottom (or extreme top) of his style distribution may indicate a problem manager or it may be a manager with a very concentrated portfolio (with the associated increased exposure to randomness and resulting high volatility of relative performance). The latter manager would be appropriate within some portfolio structures, but problematic within others.

**Step 3: Identify Manager “Bets”**

The final level in the review process of a potential problem manager is a detailed factor analysis of portfolio holdings at multiple points in time over the period of poor performance. Factor analysis is a statistical modeling method in which incremental return relative to a neutral benchmark is decomposed into contributions from individual factor exposures and positions. This analysis identifies the precise exposure bets that led to the poor results. With this information in advance of a manager meeting, the board can determine whether the bets were intentional, with a satisfactory explanation, or unintentional — suggesting a change is in order.

The only way board members can fulfill their oversight responsibility by adequately answering the question “Is everything still OK?” is with effective performance reviews. It’s time for insurance company boards to ensure that their investment performance review is worth their attention.

1. See www.peeranalytics.com/index-limitations/
2. Farquhar, T., Rosenberg, B., and Rudd, A. (1982), “Factor-Related and Specific Returns of Common Stocks: Serial Correlation and Market Inefficiency,” The Journal of Finance.
3. Performance in the top tail of the distribution may indicate a manager exposing the portfolio to unacceptable risk.

## Hedge Fund Mean Reversion

*With predictive analytics and a robust model, investors can not only identify persistently strong stock pickers but also construct portfolios with predictably strong nominal performance.*


Our earlier articles explored *hedge fund survivor (survivorship) bias* and *large fund survivor bias*. These artifacts can nearly double nominal returns and overstate security selection (stock picking) performance by 80%. Due to these biases, future performance of the largest funds disappoints. The survivors and the largest funds have excellent past nominal performance, yet it is not predictive of their future returns due to *hedge fund mean reversion*, a special case of *reversion toward the mean*. Here we explore this phenomenon and its mitigation.

We follow the approach of our earlier pieces that analyzed hedge funds’ long U.S. equity portfolios (*HF Aggregate*). This dataset spans the long portfolios of all U.S. hedge funds active over the past 15 years that are tractable using 13F filings.

## Mean Reversion of Nominal Hedge Fund Returns

To illustrate the mean reversion of nominal hedge fund returns, we have assembled hedge fund portfolios with the highest and lowest trailing 36-month performance and track these groups over the subsequent 36 months. This covers the past 15 years and considers approximately 100 such group pairs.
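The grouping procedure can be sketched as follows, with a toy two-fund dataset in place of the actual 13F-derived portfolios:

```python
def compound(returns):
    """Cumulative return from a list of periodic returns."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0

def top_vs_bottom(fund_returns, window=36):
    """Rank funds by trailing-`window` return, then compare the average
    subsequent-`window` returns of the top and bottom halves.
    fund_returns maps fund name -> at least 2*window monthly returns."""
    prior = {f: compound(rs[:window]) for f, rs in fund_returns.items()}
    ranked = sorted(prior, key=prior.get, reverse=True)
    half = len(ranked) // 2
    avg_next = lambda funds: sum(
        compound(fund_returns[f][window:2 * window]) for f in funds) / len(funds)
    return avg_next(ranked[:half]), avg_next(ranked[half:])

# toy data: very different pasts, identical futures
funds = {"A": [0.02] * 36 + [0.01] * 36,
         "B": [-0.01] * 36 + [0.01] * 36}
print(top_vs_bottom(funds))
```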

If strong historical performance is predictive, we should see future (ex-post, realized) outperformance of the best historical performers relative to the worst. This would support the wisdom of chasing the largest funds or the top-performing gurus.

The following table tracks past and future performance of each group. The average subsequent performance of the historically best- and worst-performing long U.S. equity hedge fund portfolios is practically identical and similar to the market return. There is some difference in the distributions, however: highest performers’ subsequent returns are *skewed to the downside*; lowest performers’ subsequent returns are *skewed to the upside*:

|  | Prior 36 Months Return (%) | Subsequent 36 Months Return (%) |
| --- | --- | --- |
| High Historical Returns | 52.84 | 28.17 |
| Low Historical Returns | -11.43 | 28.33 |

Thus, nominal historical returns are not predictive of future performance. We will try a few simple metrics of *risk-adjusted performance* next to see if they prove more effective.

## Sharpe Ratio and Mean Reversion of Returns

*Sharpe ratio* is a popular measure of risk-adjusted performance that attempts to account for risk using return volatility. The following table tracks past and future performance of portfolios with the highest and lowest historical Sharpe ratios. The average future performance of the best- and worst-performing portfolios begins to diverge, though we have not tested this difference for statistical significance:

|  | Prior 36 Months Return (%) | Subsequent 36 Months Return (%) |
| --- | --- | --- |
| High Historical Sharpe Ratios | 43.65 | 28.38 |
| Low Historical Sharpe Ratios | -8.34 | 25.66 |

Note that portfolios with the highest historical Sharpe ratios perform similarly to the best and worst nominal performers in the first table. However, portfolios with the lowest historical Sharpe ratios underperform by 2.5%. Sharpe ratio does not appear to predict high future performance, yet it may help guard against poor results.
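For reference, a standard annualized Sharpe ratio computed from monthly returns (the return series here is made up):

```python
import math
import statistics

def sharpe_ratio(monthly_returns, monthly_rf=0.0):
    """Annualized Sharpe ratio from monthly returns: mean excess return
    over its sample standard deviation, scaled by sqrt(12)."""
    excess = [r - monthly_rf for r in monthly_returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(12)

print(f"{sharpe_ratio([0.02, -0.01, 0.03, 0.00, 0.01, -0.02]):.2f}")
```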

## Win/Loss Ratio and Mean Reversion of Returns

Sharpe ratio and similar *parametric* approaches make strong assumptions, including *normality of returns*. We try a potentially more robust *non-parametric* measure of performance free of these assumptions – *the win/loss ratio*, closely related to *the batting average*. The following table tracks past and future performance of portfolios with the highest and lowest historical win/loss ratios. The relative future performance of the two groups is similar:

|  | Prior 36 Months Return (%) | Subsequent 36 Months Return (%) |
| --- | --- | --- |
| High Historical Win/Loss Ratios | 26.93 | 27.59 |
| Low Historical Win/Loss Ratios | 1.70 | 26.53 |

Win/loss ratio does not appear to improve on the predictive ability of Sharpe ratio. In fact, both groups slightly underperform the low performers from the first table above.
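A minimal win/loss ratio relative to a benchmark, with no distributional assumptions (the return series here are made up):

```python
def win_loss_ratio(portfolio_returns, benchmark_returns):
    """Non-parametric performance measure: periods beating the benchmark
    divided by periods trailing it. No normality assumption needed."""
    wins = sum(1 for p, b in zip(portfolio_returns, benchmark_returns) if p > b)
    losses = sum(1 for p, b in zip(portfolio_returns, benchmark_returns) if p < b)
    return wins / losses if losses else float("inf")

# hypothetical monthly returns: two wins, two losses -> ratio 1.0
print(win_loss_ratio([0.02, 0.01, -0.01, 0.03], [0.01, 0.02, 0.00, 0.01]))
```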

## Persistence of Hedge Fund Security Selection Returns

Nominal returns, and simple metrics that rely on nominal returns, suffer from mean reversion because the *systematic (factor) returns* responsible for the bulk of portfolio volatility are themselves mean-reverting. Proper risk adjustment with a robust risk model that eliminates systematic risk factors and purifies the residual addresses this problem.

AlphaBetaAnalytics’ measure of this residual security selection performance is *αReturn* – outperformance relative to a replicating factor portfolio. αReturn is also the return a portfolio would have generated if markets had been flat. The following table tracks past and future security selection performance of portfolios with the highest and lowest historical αReturns. The future security selection performance of the best and worst stock pickers diverges by over 10%:

|  | Prior 36 Months αReturn (%) | Subsequent 36 Months αReturn (%) |
| --- | --- | --- |
| High Historical αReturns | 60.90 | 5.65 |
| Low Historical αReturns | -35.66 | -4.58 |

## Security Selection and Persistent Nominal Outperformance

Strong security selection performance and strong αReturns can always be turned into nominal outperformance. In fact, a portfolio with positive αReturns can be hedged to outperform any broad benchmark. Nominal outperformance is convenient and easy to understand. These are the returns that investors “can eat.”

The following table tracks past and future nominal performance of portfolios with the highest and lowest historical αReturns, hedged to match the U.S. equity market’s risk (factor exposures). Hedging preserved security selection returns and compounded them with market performance: future performance of the two groups diverges by over 11%:

|  | Prior 36 Months Return (%) | Subsequent 36 Months Return (%) |
| --- | --- | --- |
| High Historical αReturns | 81.70 | 32.50 |
| Low Historical αReturns | -28.93 | 21.41 |

Note that, as with the Sharpe ratio, αReturn is most effective at identifying future under-performers.

Thus, with predictive analytics and a robust model, investors can not only identify persistently strong stock pickers but also construct portfolios with predictably strong nominal performance.

## Conclusions

- Due to hedge fund mean reversion, future performance of the best and worst nominal performers of the past is similar.
- Re-processing nominal returns does not eliminate mean reversion. However, Sharpe ratio begins to identify future under-performers.
- Risk-adjusted returns from security selection (stock picking) persist. A robust risk model can isolate these returns and identify strong future stock pickers.
- Hedging can turn persistent security selection returns into outperformance relative to any benchmark:
- Hedged portfolio of the best stock pickers persistently outperforms.
- Hedged portfolio of the worst stock pickers persistently underperforms.

## Returns-Based Style Analysis: Overfitting and Collinearity

*Plagued by overfitting and collinearity, returns-based style analysis frequently fails, confusing noise with portfolio risk.*

*Returns-based style analysis (RBSA) is a common approach to investment risk analysis, performance attribution, and skill evaluation. Returns-based techniques perform regressions of returns over one or more historical periods to compute portfolio betas (exposures to systematic risk factors) and alphas (residual returns unexplained by systematic risk factors). The simplicity of the returns-based approach has made it popular, but it comes at a cost: RBSA fails for active portfolios. In addition, this approach is plagued by the statistical problems of overfitting and collinearity, frequently confusing noise with systematic portfolio risk.*

**Returns-Based Style Analysis – Failures for Active Portfolios**

In an earlier article we illustrated the flaws of returns-based style analysis when factor exposures vary, as is common for active funds:

- Returns-based analysis typically yields flawed estimates of portfolio risk.
- Returns-based analysis may not even accurately estimate average portfolio risk.
- Errors will be most pronounced for the most active funds:

- Skilled funds may be deemed unskilled.
- Unskilled funds may be deemed skilled.

These are not the only flaws. We now turn to the subtler and equally critical issues – failures in the underlying regression analysis itself. We use a recent Morningstar article as an example.

**iShares Core High Dividend ETF (HDV) – Returns-Based Style Analysis**

A recent Seeking Alpha article provides an excellent illustration of problems created by *overfitting* and *collinearity*. In this article, Morningstar performed returns-based style analysis of iShares Core High Dividend ETF (HDV).

Morningstar estimated the following factor exposures for HDV using the *Carhart model*:

*iShares Core High Dividend ETF (HDV) – Estimated Factor Exposures Using the Carhart Model – Source: Morningstar*

The *Mkt-RF* coefficient, or loading, is HDV’s estimated market beta. A beta value of 0.67 means that given a +1% change in the market HDV is expected to move by +0.67%, everything else held constant.

The article then performs RBSA using an enhanced *Carhart + Quality Minus Junk (QMJ)* model:

*iShares Core High Dividend ETF (HDV) – Estimated Factor Exposures Using the Carhart + Quality Minus Junk (QMJ) Model – Source: Morningstar*

With the addition of the QMJ factor, the market beta estimate increased by a third, from 0.67 to 0.90. The two estimates cannot both be right. Perhaps the simplicity of the Carhart model is to blame and the more complex five-factor RBSA is more accurate?

**iShares Core High Dividend ETF (HDV) – Historical Factor Exposures**

Instead of Morningstar’s RBSA approach, we analyzed HDV’s historical holdings using the *ABW Peer Analytics U.S. Equity Risk Model*. For each month, we estimated the U.S. Market exposures (betas) of individual positions and aggregated these into monthly estimates of portfolio beta:

*iShares Core High Dividend ETF (HDV) – Historical Market Exposure (Beta)*

Over the past 4 years, HDV’s beta varied in a narrow range between 0.50 and 0.62.

Both of the above returns-based analyses were off, but the simpler Carhart model did best. It turns out that the simpler, less sophisticated returns-based model is less vulnerable to the statistical problems of multicollinearity and overfitting. The only way to find out that returns-based style analysis had failed was to perform holdings-based analysis using a multi-factor risk model.

**Statistical Problems with Returns-Based Analysis**

**Multicollinearity**

Collinearity (multicollinearity) occurs when risk factors used in returns-based analysis are highly correlated with each other. For instance, small-cap stocks tend to have higher beta than large-cap stocks, so the performance of small-cap stocks relative to large-cap stocks is correlated with the market.

Erratic changes in estimated factor exposures across time periods, or when new risk factors are added, are signs of collinearity. These erratic changes make it difficult to pin down factor exposures and point to deeper problems:

*A principal danger of such data redundancy is that of overfitting in regression analysis models.*

**Overfitting**

Overfitting is a consequence of redundant data or model over-complexity. Both are common in returns-based analyses, which usually attempt to explain a limited number of return observations with a larger number of correlated variable observations.

An overfitted returns-based model may appear to describe data very well. But the fit is misleading – the exposures may be describing noise and will change dramatically under minor changes to data or factors. A high *R squared* from returns-based models may be a sign of trouble, rather than a reassurance.
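A toy demonstration of both problems: below, returns truly depend on a single factor, but adding a nearly collinear second factor to the regression makes the individual coefficient estimates unstable, even though their sum remains well determined. All data are simulated and the factor names are hypothetical.

```python
import random

rng = random.Random(0)

# true model: returns depend on a single factor f1 with beta 0.6
n = 60
f1 = [rng.gauss(0.0, 0.04) for _ in range(n)]
f2 = [x + rng.gauss(0.0, 0.005) for x in f1]       # nearly collinear with f1
y = [0.6 * x + rng.gauss(0.0, 0.01) for x in f1]   # f2 plays no real role

def ols1(x, y):
    """Univariate OLS slope (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def ols2(x1, x2, y):
    """Bivariate OLS slopes (no intercept) via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(a * a for a in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * b for a, b in zip(x1, y))
    s2y = sum(a * b for a, b in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det

b_single = ols1(f1, y)
b1, b2 = ols2(f1, f2, y)
print(f"f1 beta alone: {b_single:.2f}")
print(f"with nearly collinear f2: f1 beta {b1:.2f}, f2 beta {b2:.2f}")
```

The single-factor estimate lands close to the true 0.6, while the two-factor coefficients can swing far from it; only their sum stays pinned down. This is the same instability seen in the HDV beta estimates above.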

As we have seen with HDV, exposures estimated by RBSA may bear little relationship to the portfolio risk. Therefore, all dependent risk and skill data will be flawed.

**Conclusions**

- When a manager does not vary exposures to the market, sector, and macroeconomic factors, returns-based style analysis (RBSA) using a parsimonious model can be effective.
- When a manager varies bets, RBSA typically yields flawed estimates of portfolio risk.
- Even when exposures do not vary, returns-based style analysis is vulnerable to multicollinearity and overfitting:

- The model may capture noise, rather than the underlying factor exposures.
- Factor exposures may vary erratically among estimates.
- Estimates of portfolio risk will be flawed.
- Skilled funds may be deemed unskilled.
- Unskilled funds may be deemed skilled.

- Holdings-based analysis using a robust multi-factor risk model is superior for quantifying fund risk and performance.