Magic Formula

The Magic Formula has been discussed in this forum before, but few backtest results have been shared so far. Here I've implemented Greenblatt's strategy with minor modifications, such as filtering out mining and pharmaceutical companies. I ran backtests in segments with different market cap ranges, which showed that eliminating small-cap stocks with a market cap under a billion dollars improves the overall return. Small-cap baskets tend to get destroyed by a number of companies losing more than 30 percent of their value. As Greenblatt himself remarks, the strategy has periods of underperformance compared to the S&P, but in the long run it does seem to come out slightly ahead.

I'm now interested in comparing the predictive power of fundamental ratios. The original formula weights return on investment (ROI) and earnings yield (EY) equally, but I found some discussions arguing that EY should be weighted more heavily. Ideally I would like to run a regression analysis on these two ratios and other fundamental metrics, but it's difficult to find free historical fundamental data to test this thesis, so I'm wondering if someone here with some experience can chime in.
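As a rough illustration of what weighting EY more heavily would mean, here is a minimal sketch in plain Python. The tickers, the ratio values, and the helper names `descending_ranks` and `magic_formula_scores` are all made up for this example; it is not the code from the attached backtest.

```python
def descending_ranks(values):
    """Return ranks where 1 goes to the largest value (ties broken by dict order)."""
    order = sorted(values, key=values.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(order)}


def magic_formula_scores(roi, ey, ey_weight=0.5):
    """Blend the ROI rank and the EY rank; a lower score means a more attractive stock."""
    roi_rank = descending_ranks(roi)
    ey_rank = descending_ranks(ey)
    return {
        name: (1.0 - ey_weight) * roi_rank[name] + ey_weight * ey_rank[name]
        for name in roi
    }


# Made-up ratios for four hypothetical tickers.
roi = {"AAA": 0.30, "BBB": 0.25, "CCC": 0.05, "DDD": 0.10}
ey = {"AAA": 0.04, "BBB": 0.09, "CCC": 0.12, "DDD": 0.06}

equal = magic_formula_scores(roi, ey, ey_weight=0.5)      # original 50/50 weighting
ey_heavy = magic_formula_scores(roi, ey, ey_weight=0.75)  # EY counts three times as much as ROI

best_equal = min(equal, key=equal.get)
best_ey_heavy = min(ey_heavy, key=ey_heavy.get)
print(best_equal, best_ey_heavy)
```

With these toy numbers the top pick changes once EY gets the larger weight, which is the kind of sensitivity a regression on real fundamentals would have to sort out.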

14 responses

Hi, Jonh.
Thank you for sharing your code. It was a great starting point for me in learning Quantopian.
But it seems that we can no longer use get_fundamentals(), which your code relies on.
I modified your code to use pipeline instead and backtested it over the same period.

I modified just 3 lines and added make_pipeline() to your code.
The modifications are below.

  1. for stock in context.stocks.index: changed to for stock in context.output.index: (line 52 in your code), to use the pipeline's output.
  2. data[stock].price changed to data.current(stock, 'price') (line 60 in your code), because it looks like data[stock].price was deprecated.
  3. context.output=pipeline_output('my_pipeline').sort_values(by='MF_rank', ascending=True).head(int(context.capacity)) (in before_trading_start() in your code)

The additions are below.

  1. my_pipe = make_pipeline()
  2. attach_pipeline(my_pipe, 'my_pipeline')
    (initialize() in your code)
  3. make_pipeline(): ...

But the results were quite different. (My backtest had lower returns than the benchmark and a high max drawdown.)

Could you please look at my code and help me?

Best to rank using the mask:

EY_rank = earnings_yield.rank(ascending=False, mask=filter_plus)  
roic_rank = roic.rank(ascending=False, mask=filter_plus)
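To show why the mask matters, here is a small plain-Python sketch of masked ranking. It assumes the mask behaves as "compute ranks only among assets that pass the filter". The helper `masked_rank` and the data are hypothetical, and None stands in for the missing value a masked-out asset would get.

```python
def masked_rank(values, mask):
    """Rank descending (1 = largest) among names whose mask is True; others get None."""
    eligible = [name for name in values if mask[name]]
    order = sorted(eligible, key=values.get, reverse=True)
    ranks = {name: None for name in values}
    for position, name in enumerate(order):
        ranks[name] = position + 1
    return ranks


# Made-up earnings yields; AAA fails the (hypothetical) filter.
earnings_yield = {"AAA": 0.15, "BBB": 0.10, "CCC": 0.08, "DDD": 0.05}
filter_plus = {"AAA": False, "BBB": True, "CCC": True, "DDD": True}
no_mask = {name: True for name in earnings_yield}

unmasked = masked_rank(earnings_yield, no_mask)
masked = masked_rank(earnings_yield, filter_plus)
print(unmasked)
print(masked)
```

Without the mask, the filtered-out stock still takes rank 1 and pushes every eligible stock down a place, so the combined rank is computed against names that can never be bought.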

Remove close.latest, which is not in the original.

This is a little different.

Thanks, Blue Seahawk.
Your answer is really helpful for me!

@K, thank you for mentioning it, and here are some more possibilities then too.
Mainly, run this and take a look at the log window for visibility into the pipeline values.
This is just a start toward adding short if you wish, it would need some work.

The focus here is to provide options, tools and flexibility. For example: class Wild() for quick development when trying things; normalization of positive and negative weights separately, to be able to add shorting if you wish (that's where things went south with the last-minute addition of norm()); logging of pipeline min, mean, max and some highs and lows; forward filling of nans (a class was necessary there, to have a window to work with rather than just latest); an example of percentile_between that you might want to try some numbers in; examples of zscore and demean (as one way to obtain some negative values for short shares); and a slightly more efficient route for 'today'. The efficient pnl determination for long and short simultaneously helps make an addition of shorting easier to work with, if you are interested in moving toward qualifying in the contest, for example.

Returns were not the point so this backtest is only a few days. Rather than going with this algo, you could use it to copy/paste various bits over to yours in trying some things.

Why are you using ascending=False?
EY_rank = earnings_yield.rank(ascending=False, mask=filter_plus)
Wouldn't a higher earnings yield be better value?

This strategy is so weird. If you change the "ascending" setting on line 98 of Blue's code as shown below, you are selecting the worst companies in the output list (~750 companies) to go long. The performance is still pretty good. However, if you go short, it performs terribly.

context.output=pipeline_output('my_pipeline').sort_values(by='MF_rank', ascending=True).head(int(context.capacity)).dropna()  

to

context.output=pipeline_output('my_pipeline').sort_values(by='MF_rank', ascending=False).head(int(context.capacity)).dropna()  
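For readers following the sort direction, a toy sketch of what the two lines above select. It assumes MF_rank is a combined rank where a lower number is better; the helper `select` and the rank values are invented for illustration and only mimic sort_values(...).head(...).

```python
def select(mf_rank, capacity, ascending=True):
    """Sort tickers by MF_rank and keep the first `capacity` of them."""
    ordered = sorted(mf_rank, key=mf_rank.get, reverse=not ascending)
    return ordered[:capacity]


# Hypothetical combined ranks: lower is assumed to be better.
mf_rank = {"AAA": 3, "BBB": 7, "CCC": 12, "DDD": 20, "EEE": 31}

best = select(mf_rank, capacity=2, ascending=True)    # lowest combined ranks
worst = select(mf_rank, capacity=2, ascending=False)  # highest combined ranks
print(best, worst)
```

Under that assumption, ascending=True keeps the best-ranked names and ascending=False keeps the worst-ranked ones from the same filtered list.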

It outperforms here due to higher beta. <=1% alpha is something, but I imagine it can be improved with a few tweaks here.
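One way to separate beta from alpha is a simple least-squares fit of strategy returns on benchmark returns. The sketch below uses made-up per-period returns and a hypothetical helper `beta_alpha`. The alpha it returns is per period and not annualized, so it is not directly comparable to the figure in the backtest summary.

```python
def beta_alpha(strategy_returns, benchmark_returns):
    """Least-squares slope (beta) and intercept (per-period alpha) against the benchmark."""
    n = len(benchmark_returns)
    mean_s = sum(strategy_returns) / n
    mean_b = sum(benchmark_returns) / n
    cov = sum(
        (s - mean_s) * (b - mean_b)
        for s, b in zip(strategy_returns, benchmark_returns)
    ) / n
    var = sum((b - mean_b) ** 2 for b in benchmark_returns) / n
    beta = cov / var
    alpha = mean_s - beta * mean_b
    return beta, alpha


# Made-up returns: the strategy is built as 1.5x the benchmark plus 0.1% per period.
benchmark = [0.02, -0.01, 0.03, 0.00]
strategy = [1.5 * b + 0.001 for b in benchmark]

beta, alpha = beta_alpha(strategy, benchmark)
print(round(beta, 6), round(alpha, 6))
```

A strategy like this toy one beats the benchmark in rising markets mostly because of its 1.5 beta, with only the small intercept left over as alpha.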

If you need free historical fundamentals data, I run TenQuant.io. Feel free to check it out; it retrieves data in real time from EDGAR.

Hi guys,
I worked for some days on the magic formula, starting from this post (really appreciated).
After many modifications and backtests, I found a really weird behaviour that depends on the month in which the stocks are rotated:
the strategy's results change heavily if the stock buys/sells are made in a month other than January (as proposed in the examples in this discussion).
I took Blue Seahawk's code and backtested it with each of the 12 months; here are the results:
https://drive.google.com/file/d/1Gzzrpw8ygGy7d9gFyo4tz3IT9XYpWatb/view?usp=sharing

Moreover, I have made many changes to the algorithm (using the interval 2003-2020 instead of 2011-2017, using FCF yield instead of EBIT/EV, using a custom formula to extract ROCE instead of ROIC, and other minor changes), but the results show the same behaviour: really good in January, good in the closing months of the year, and really bad if the rotation is made in spring/summer:
https://drive.google.com/file/d/1u6AnsqsqSLXHuZ5EkGwnyBKuNgSTB5FI/view?usp=sharing

I can't find an explanation for this behaviour (it doesn't seem random). How can this happen?

@Luca Wiegand Low frequency trading is very susceptible to initial state values. Think of trying to fly a plane from New York to Los Angeles. If the pilot adjusts his course every minute then chances are the flight path will be quite straight with a lot of tiny corrections. On the other hand, if the pilot only adjusts his course every hour, the flight path will be much more erratic and potentially quite far off a 'straight line' at times. Moreover, if the direction calculations are off a bit, they will have a much greater impact when adjusting only hourly.

Conventional wisdom is that the markets don't perform well in the spring and summer. There is the adage "sell in May and go away". Trading in spring and summer isn't reflective of the other parts of the year. So, similar to the flight example above, if one bases trading 'direction decisions' only on data from those months, then the results may be significantly different from a 'straight line'.


Hi Dan, thanks for the response.
I agree that low-frequency trading can be more susceptible, but in that case the yearly results should not be biased toward one specific month; they should be random.
If you check the yearly results for choosing the stocks in January versus July:

https://drive.google.com/file/d/1b_BLQc2XhBd6RuGvZXk8df2x6jO7Lh2O/view?usp=sharing

The January strategy beats July in 13 of the 16 years.
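To put a number on "it should be random", here is a quick sketch of the chance of seeing at least 13 wins in 16 years if each year were an independent coin flip. The independence and the 50/50 odds are simplifying assumptions for this example, and `prob_at_least` is a hypothetical helper.

```python
from math import comb


def prob_at_least(wins, trials, p=0.5):
    """Probability of at least `wins` successes in `trials` independent tries."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(wins, trials + 1)
    )


p_value = prob_at_least(13, 16)
print(p_value)
```

Under those assumptions the probability works out to 697/65536, roughly 1%, so a pure-luck explanation looks unlikely, although it says nothing about the cause.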

Regarding the underperformance in spring and summer: even though it may hold only in some years, all the data used in the strategy is annual (LTM), except for the items that come from the balance sheet. So I can't see why selecting stocks during an underperforming period would lead to underperformance for the whole year.

I attach a backtest if someone wants to have a look.

The conventional wisdom of "sell in May and go away" is sort of premised on the fact that most companies' fiscal years end in December. Results come out in January, and the markets digest those results in February. Companies that did well are rewarded with higher share prices, while those that didn't get lower share prices. Conventional wisdom is that good stocks are often at their peak by May, while poorer-performing stocks are at their low. Hence, 'sell in May' assumes you held some good stocks, so sell them at their peak.

Consider what this strategy does. It buys 'good stocks' and sells 'bad stocks'. After earnings season, during the months of May through the summer, the 'good stocks' are at their highs while 'bad stocks' are at their lows. By rebalancing during these months the algo effectively buys 'high' and sells 'low'. Not the formula for a winning strategy.

However, consider moving that rebalancing forward before earnings season. The 'good stocks' haven't been run up and the 'bad stocks' haven't been dragged down. The algo, as noted, uses annual fundamental data for the most part. The picks for 'good' and 'bad' stocks therefore won't change a lot whether before or after earnings season. The algo will probably be trading about the same stocks. That hasn't changed. What has changed? The price. By trading before earnings season one has a fighting chance at buying low and selling high. That is a much better formula for a winning strategy.

These are of course generalities, and there are many exceptions. But stock valuations, on average, are not random, with the highest and lowest valuations falling in December through February. Check out the attached notebook showing the number of companies reaching their min and max PE by month over the 10 years 2010-2019.

@Dan,

Why are there only 10 bars on the histogram?

@Vladimir Good catch regarding the number of bars. I got lazy and used the default bins=10 for the hist method. Setting bins=12 is more appropriate. Doing that also changes the graph so that it no longer shows such explicit peaks in January. It still shows definite variation, just not as pronounced. Again, good catch.
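A small plain-Python sketch of why 10 bars distort monthly data. It mimics equal-width binning between the minimum and maximum value with the last bin closed on the right, which is assumed here to match the default histogram behaviour. The helper `equal_width_counts` is hypothetical, and the input is one observation per month rather than the notebook's data.

```python
def equal_width_counts(values, n_bins):
    """Count values in n_bins equal-width bins spanning [min, max]; the last bin includes max."""
    lo, hi = min(values), max(values)
    counts = [0] * n_bins
    for v in values:
        index = int((v - lo) * n_bins / (hi - lo))
        counts[min(index, n_bins - 1)] += 1
    return counts


months = list(range(1, 13))  # one observation for each month, 1 = January

ten_bins = equal_width_counts(months, 10)
twelve_bins = equal_width_counts(months, 12)
print(ten_bins)
print(twelve_bins)
```

With 10 bins the first and last bars each absorb two months (January with February, November with December), so perfectly flat data shows artificial peaks at both ends; with 12 bins every month gets its own bar.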

Attached is an updated notebook.

@Dan,

I created "Seasonality of min and max monthly return", using your notebook.
Hope it will be useful.

Please respond to my request.