Beat The Market - Driving Alpha With Data - Part 1
What Drives Stock Returns?
There are 3 primary drivers of return: earnings, valuation and dividends. These 3 factors can either all work in unison or move in opposite directions. In the short run the stock market is a voting machine, but in the long run it’s a weighing machine. What this means is that short term stock fluctuations are driven by investor sentiment, which pushes prices higher and lower. But in the long run stock returns are more closely tied to fundamental business growth.
Basically, if a company is consistently growing its earnings per share, in the long run the share price will follow the same trajectory. Investor sentiment and economic conditions will pull and tug at the share price, driving the price-to-earnings multiple to expand or contract. And finally, the company’s dividend policy will contribute to the overall stock return.
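To make the three drivers concrete, here is a minimal sketch of how they combine into an annual return. It assumes that price equals EPS times the P/E multiple, so price growth compounds EPS growth with the change in the multiple, and the dividend yield is added on top. The function name and the input values are hypothetical and only for illustration.

```python
def annualized_return(eps_growth, multiple_change, dividend_yield):
    """Approximate annual total return from its three drivers.

    Price = EPS * P/E, so price growth compounds EPS growth with the
    change in the multiple; the dividend yield is added on top.
    All inputs are annual rates expressed as decimals.
    """
    price_return = (1 + eps_growth) * (1 + multiple_change) - 1
    return price_return + dividend_yield


# Hypothetical inputs: 8% EPS growth, a 2% yearly multiple contraction, 2% yield.
total = annualized_return(0.08, -0.02, 0.02)
print(f"{total:.2%}")  # prints 7.84%
```

In this example the shrinking multiple eats into the earnings growth, which is the kind of tug of war between sentiment and fundamentals described above.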
Fundamentally this is a very simple concept to understand, but the constant noise and volatility of the market make it difficult to see. Understanding what drives stock returns doesn’t necessarily help you gain an edge as an investor, or does it?
Use Data To Drive Alpha
Some of us are better investors than others, and not everyone has a knack for conceptualizing qualitative factors. A good way to think about the difference between a quantitative person and a qualitative person is how you answer this math question: does 2 + 2 always equal 4, or does 2 + 2 not always equal 4?
I’m definitely more of a quantitative person; 2 + 2 always equals 4 to me. That being said, I understand that in investing, qualitative data is as valuable as quantitative data. It’s good to know what you don’t know or what you’re not good at. Acknowledging your shortcomings lets you focus on your strengths. This is why I steer towards a more data-driven approach to investing.
Today I want to share with you the first part of my latest data-driven approach, which ideally will deliver alpha purely by leveraging quantitative data.
Part 1 is the exploratory data analysis portion, a first step for any data scientist.
Collecting Data For The Analysis
The objective is to find a handful of financial metrics, see which ones are correlated with stock returns, and then leverage those metrics to build a stock selection model that can deliver alpha. The time periods will be the last 5 and 10 years. Here are the metrics that I chose to review.
Dividend Growth
Return on Invested Capital
Return on Capital Employed
Return on Equity
Return on Assets
FCF Margin
Gross Margin
Diluted Earnings Per Share Growth
Revenue Growth
I also want to find a universe of stocks that are potentially higher quality companies. The initial screener included the following parameters.
Exchanges - NYSE + Nasdaq
5 Year dividend growth of at least 5%
5 Year return on capital employed of at least 15%
5 Year average gross margin of at least 20%
5 Year revenue growth of at least 5%
These criteria yielded a list of 141 unique companies. The data was extracted into a spreadsheet, and now it’s time to explore.
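The screen above can be sketched as a simple filter. This is only an illustration: the records, tickers and field names below are hypothetical and do not come from any real screener, while the thresholds are the ones listed in the screen.

```python
# Hypothetical records; the field names and values are illustrative,
# not the output of any real screening tool.
stocks = [
    {"ticker": "AAA", "exchange": "NYSE", "div_growth_5y": 0.08,
     "roce_5y": 0.22, "gross_margin_5y": 0.45, "rev_growth_5y": 0.09},
    {"ticker": "BBB", "exchange": "Nasdaq", "div_growth_5y": 0.03,
     "roce_5y": 0.30, "gross_margin_5y": 0.60, "rev_growth_5y": 0.12},
    {"ticker": "CCC", "exchange": "LSE", "div_growth_5y": 0.10,
     "roce_5y": 0.18, "gross_margin_5y": 0.35, "rev_growth_5y": 0.07},
]

# Minimum thresholds taken from the screen described above.
MINIMUMS = {
    "div_growth_5y": 0.05,
    "roce_5y": 0.15,
    "gross_margin_5y": 0.20,
    "rev_growth_5y": 0.05,
}
EXCHANGES = {"NYSE", "Nasdaq"}


def passes_screen(stock):
    """True if the stock trades on an allowed exchange and meets every minimum."""
    if stock["exchange"] not in EXCHANGES:
        return False
    return all(stock[field] >= floor for field, floor in MINIMUMS.items())


universe = [s["ticker"] for s in stocks if passes_screen(s)]
print(universe)  # prints ['AAA']
```

Here BBB is dropped for dividend growth below 5% and CCC for trading on an exchange outside the screen.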
The first part is cleaning up the data. Several of the selected companies were missing some of the data points I wanted to test, so they had to be removed from the dataset.
The 5 year analysis was trimmed to 135 companies (6 were eliminated due to insufficient data), and the 10 year analysis was trimmed to 123 companies (18 were eliminated due to insufficient data).
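The cleanup step can be sketched as dropping any company that is missing a value for one of the nine metrics or for its return. The work described here was done in a spreadsheet, so the code below is just one possible equivalent; the metric keys, the make_row helper and the sample rows are hypothetical.

```python
# Hypothetical short keys for the nine metrics under review.
METRICS = ["dgr", "roic", "roce", "roe", "roa",
           "fcf_margin", "gross_margin", "eps_growth", "rev_growth"]


def drop_incomplete(rows, metrics=METRICS):
    """Keep only rows where every metric and the CAGR are present."""
    required = list(metrics) + ["cagr"]
    return [r for r in rows if all(r.get(k) is not None for k in required)]


def make_row(ticker, **overrides):
    """Build an illustrative row with placeholder values for every field."""
    row = {"ticker": ticker, "cagr": 0.15}
    row.update({m: 0.10 for m in METRICS})
    row.update(overrides)
    return row


rows = [make_row("AAA"), make_row("BBB", roic=None), make_row("CCC", cagr=None)]
clean = drop_incomplete(rows)
print(len(rows) - len(clean), "dropped;", [r["ticker"] for r in clean], "kept")
```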
The average CAGR for the 5 year analysis was 19.81%, with the median CAGR being 18.35%.
The average CAGR for the 10 year analysis was 16.27%, with the median CAGR being 15.63%.
Since the average CAGR is higher than the median CAGR in both datasets, the returns are positively skewed: a minority of stocks with outsized returns pulls the average above the midpoint. This is both good and bad. It’s bad because to generate alpha we will have to identify that handful of winners. It’s good because if we are able to successfully identify the winners, we can generate more alpha.
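A small sketch shows why an average above the median points to a few big winners. It uses the standard CAGR formula, (ending value / starting value) ^ (1 / years) - 1, on made-up numbers. The ending values below are hypothetical and are not taken from the dataset.

```python
from statistics import mean, median


def cagr(start_value, end_value, years):
    """Compound annual growth rate as a decimal."""
    return (end_value / start_value) ** (1 / years) - 1


# Hypothetical 5 year ending values of $100 invested in five stocks,
# with one outsized winner at the end of the list.
ending_values = [150, 180, 200, 220, 600]
cagrs = [cagr(100, v, 5) for v in ending_values]

avg, mid = mean(cagrs), median(cagrs)
above_average = sum(c > avg for c in cagrs)

print(f"mean {avg:.2%}, median {mid:.2%}")
print(f"{above_average} of {len(cagrs)} stocks beat the average")
```

In this toy example a single winner lifts the average well above the median, and it is the only stock that beats the average.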
To test which metric is most correlated with a stock’s return, I sorted the dataset by each metric one at a time, broke it up into quintiles and looked at the average CAGR of each quintile. Here is the final output for the 5 year analysis.
The table above shows the average return for each quintile for each metric. The CAGR column shows the “perfect” return for each quintile, meaning the result of sorting the stocks by CAGR itself. Here are some initial observations.
EPS, ROIC and Revenue show the strongest correlation.
FCF Margin and Gross Margin are the weakest.
DGR, ROCE and ROE appear to be strongly correlated with the top 3 quintiles showing outperformance.
Here’s the final output for the 10 year analysis.
We can see slightly different correlations compared to the 5 year analysis. Here are my initial observations.
EPS and Revenue appear to be the strongest indicators of return with the top quintile having the best return.
Gross Margin is the weakest metric, with the FCF Margin being only slightly better.
DGR, ROIC, ROCE, ROE and ROA show promising results.
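The quintile test behind both tables can be sketched as follows: rank the stocks by one metric, split the ranking into five equal groups and average the CAGR within each group. This is a minimal sketch and not the spreadsheet actually used; the function name and the sample rows are hypothetical, and it assumes at least five stocks so that no quintile is empty.

```python
from statistics import mean


def quintile_average_cagr(rows, metric):
    """Average CAGR per quintile after ranking rows by `metric`, best first.

    Assumes len(rows) >= 5 so that every quintile holds at least one stock.
    """
    ranked = sorted(rows, key=lambda r: r[metric], reverse=True)
    n = len(ranked)
    buckets = [[] for _ in range(5)]
    for i, row in enumerate(ranked):
        buckets[i * 5 // n].append(row["cagr"])  # rank position -> quintile 0..4
    return [mean(b) for b in buckets]


# Hypothetical (EPS growth, CAGR) pairs for ten stocks.
pairs = [(0.30, 0.28), (0.25, 0.22), (0.20, 0.20), (0.18, 0.21), (0.15, 0.16),
         (0.12, 0.17), (0.10, 0.12), (0.08, 0.14), (0.05, 0.09), (0.02, 0.07)]
rows = [{"eps_growth": g, "cagr": c} for g, c in pairs]

by_quintile = quintile_average_cagr(rows, "eps_growth")
for q, avg in enumerate(by_quintile, start=1):
    print(f"Q{q}: {avg:.2%}")
```

A metric that is strongly tied to returns shows average CAGR falling steadily from the first quintile to the fifth, as in this made-up example. A weak metric shows little or no ordering.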
The next step in the exploratory data analysis is to see how each individual stock compares to the median measure for each metric.
In the 5 year analysis, 4 stocks were better than the median across all 9 metrics. Here’s how the rest ranked.
8 metrics - 8 stocks
7 metrics - 13 stocks
6 metrics - 16 stocks
5 metrics - 27 stocks
4 metrics - 22 stocks
3 metrics - 17 stocks
2 metrics - 13 stocks
1 metric - 10 stocks
0 metrics - 5 stocks
In the 10 year analysis, 7 stocks were better than the median across all 9 metrics. Here’s how the rest ranked.
8 metrics - 8 stocks
7 metrics - 11 stocks
6 metrics - 18 stocks
5 metrics - 13 stocks
4 metrics - 24 stocks
3 metrics - 19 stocks
2 metrics - 18 stocks
1 metric - 12 stocks
0 metrics - 3 stocks
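The scoring behind these counts can be sketched like this: compute the median of each metric across the universe, then count for each stock how many metrics come in above that median. The example uses three metrics and five made-up stocks to stay short, and it treats “better than the median” as strictly greater, which is an assumption. All names and values are hypothetical.

```python
from collections import Counter
from statistics import median


def metrics_above_median(rows, metrics):
    """For each ticker, count the metrics strictly above the universe median."""
    medians = {m: median(r[m] for r in rows) for m in metrics}
    return {r["ticker"]: sum(r[m] > medians[m] for m in metrics) for r in rows}


# Hypothetical universe with three of the nine metrics.
rows = [
    {"ticker": "AAA", "roic": 0.30, "roe": 0.25, "eps_growth": 0.20},
    {"ticker": "BBB", "roic": 0.20, "roe": 0.30, "eps_growth": 0.05},
    {"ticker": "CCC", "roic": 0.15, "roe": 0.15, "eps_growth": 0.10},
    {"ticker": "DDD", "roic": 0.10, "roe": 0.10, "eps_growth": 0.15},
    {"ticker": "EEE", "roic": 0.05, "roe": 0.05, "eps_growth": 0.02},
]

scores = metrics_above_median(rows, ["roic", "roe", "eps_growth"])
tally = Counter(scores.values())  # how many stocks landed on each score
for score in sorted(tally, reverse=True):
    print(f"{score} metrics - {tally[score]} stocks")
```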
The important question is what type of average return (CAGR) each subset of stocks delivered.
Here are the average returns for the 5 year analysis.
We can see a pattern where the stocks that measured better than their peers across more metrics delivered stronger returns.
The same pattern was evident in the 10 year analysis as well.
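The comparison of returns by subset can be sketched as a group-by: bucket the stocks by how many metrics beat the median, then average the CAGR inside each bucket. The scores and CAGR values below are hypothetical and are not the results of the analysis.

```python
from collections import defaultdict
from statistics import mean


def average_cagr_by_score(scores, cagrs):
    """Average CAGR for each 'metrics above median' score, highest score first."""
    groups = defaultdict(list)
    for ticker, score in scores.items():
        groups[score].append(cagrs[ticker])
    return {s: mean(v) for s, v in sorted(groups.items(), reverse=True)}


# Hypothetical inputs: a score per ticker and its CAGR over the period.
scores = {"AAA": 3, "BBB": 2, "CCC": 0, "DDD": 1, "EEE": 0}
cagrs = {"AAA": 0.25, "BBB": 0.18, "CCC": 0.10, "DDD": 0.14, "EEE": 0.06}

by_score = average_cagr_by_score(scores, cagrs)
for score, avg in by_score.items():
    print(f"{score} metrics: {avg:.2%}")
```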
Clearly, investing in stocks that had superior results across more metrics relative to their peers was a smart strategy to employ. Problem solved, we have our strategy. Well, not quite. This was just the exploratory data analysis part of the journey. Next is the fun part, and also the hard part: building a stock selection model based on these trends.
The hard part is collecting and analyzing a large quantity of data to run backtests. The fun part is designing and tweaking the stock selection model, and testing it.
I’ll cover that in part 2 next week. Provided that part 2 works out as expected, in part 3 we will design a portfolio that can be tested going forward.