The recently released Pipeline API lets you run computations quickly over large universes of stocks. This opens up many possibilities, one of which is an implementation of the Fama-French Three Factor Model. Computing these factors requires partitioning a large universe of stocks, canonically thousands of equities. Before Pipeline, this wasn't possible on the Quantopian platform. Now it is.
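To make the partitioning step concrete, here is a minimal sketch in plain Python rather than the Pipeline API. It assumes the standard 2x3 sort (size split at the median, book-to-market split at the 30th and 70th percentiles) with equal-weighted buckets. The function names, the dict layout, and the choice to take breakpoints from the whole universe are illustrative assumptions, not necessarily what the attached algorithm does.

```python
from statistics import mean


def percentile(values, q):
    """Linear-interpolation percentile of a non-empty list, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def fama_french_2x3(stocks):
    """Return (SMB, HML) for one period.

    `stocks` is a list of dicts with keys 'cap' (market cap), 'bm'
    (book-to-market) and 'ret' (period return). Hypothetical layout.
    Assumes every one of the six buckets ends up non-empty.
    """
    size_cut = percentile([s["cap"] for s in stocks], 50)
    bm_lo = percentile([s["bm"] for s in stocks], 30)
    bm_hi = percentile([s["bm"] for s in stocks], 70)

    buckets = {}
    for s in stocks:
        size = "S" if s["cap"] <= size_cut else "B"
        if s["bm"] <= bm_lo:
            value = "L"
        elif s["bm"] <= bm_hi:
            value = "M"
        else:
            value = "H"
        buckets.setdefault(size + value, []).append(s["ret"])

    # Equal-weighted return of each of the six buckets.
    r = {name: mean(rets) for name, rets in buckets.items()}
    smb = mean([r["SL"], r["SM"], r["SH"]]) - mean([r["BL"], r["BM"], r["BH"]])
    hml = mean([r["SH"], r["BH"]]) - mean([r["SL"], r["BL"]])
    return smb, hml


# Toy universe: the five smallest caps return 2%, the five largest 1%.
caps = range(1, 11)
bms = [0.1, 0.5, 0.9, 0.2, 0.6, 0.15, 0.55, 0.95, 0.25, 0.65]
universe = [
    {"cap": c, "bm": b, "ret": 0.02 if c <= 5 else 0.01}
    for c, b in zip(caps, bms)
]
smb, hml = fama_french_2x3(universe)
```

In the toy universe returns depend only on size, so SMB picks up the whole spread and HML comes out at zero.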
My implementation allows for computing rolling Fama-French factors over any time period. The accuracy of my model can easily be confirmed, because Ken French has published datasets of the Fama-French factors over various timespans. Below are some examples of Ken French's and my results:
July 2015
August 2014 - August 2015
There are, of course, discrepancies due to differing methodologies. For one, Ken French considered only data from the NYSE, AMEX, and NASDAQ exchanges, whereas Quantopian draws data from over twelve US exchanges. Arguably, my implementation offers a more holistic and complete view.
Furthermore, Ken French computed his factors strictly by calendar period (week/month/year). It's possible to do the same in Quantopian, but it takes a little wrangling, since the native time unit on Quantopian is the business day. For the sake of simplicity, I left my script in terms of business days; adapting it to handle particular calendar periods is reasonably straightforward. Note that relatively small changes to the parameters of the Fama-French factors (e.g. computing them over 22 rather than 23 business days) can have a relatively large impact on the results, so be careful.
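As a rough illustration of how a calendar month maps onto a business-day window, the sketch below counts Monday-to-Friday dates in a month and skips any holidays the caller supplies. The helper name and the `holidays` argument are made up for this example; a real script would take its trading calendar from the platform. July 2015 shows the 22-versus-23 ambiguity: it has 23 weekdays, but US markets were closed on July 3 for Independence Day (observed).

```python
from datetime import date, timedelta


def business_days_in_month(year, month, holidays=()):
    """Count Monday-Friday dates in a calendar month, skipping `holidays`."""
    closed = set(holidays)
    day = date(year, month, 1)
    count = 0
    while day.month == month:
        # weekday() is 0 for Monday through 4 for Friday.
        if day.weekday() < 5 and day not in closed:
            count += 1
        day += timedelta(days=1)
    return count


weekdays = business_days_in_month(2015, 7)
trading_days = business_days_in_month(2015, 7, holidays=[date(2015, 7, 3)])
print(weekdays, trading_days)
```

Using the holiday-adjusted count as the window length keeps a business-day window aligned with the calendar month it is meant to represent.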
I hope that this algorithm is useful to you in two ways:
1. This implementation concretely illustrates a use for the Pipeline API.
2. When Pipeline is deployed to Quantopian Research, you'll be able to use variable-length Fama-French factors to regress against the returns of your algorithms, giving you further insight into your strategies.
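For the second point, a regression of strategy returns on the three factors might look like the sketch below. It uses NumPy's least-squares solver and assumes the strategy returns are already in excess of the risk-free rate and aligned period by period with the factor series. The function name and the synthetic data are purely illustrative.

```python
import numpy as np


def factor_loadings(algo_excess_returns, mkt_rf, smb, hml):
    """OLS fit of returns on the three factors.

    Returns (alpha, beta_mkt, beta_smb, beta_hml).
    """
    y = np.asarray(algo_excess_returns, dtype=float)
    # Design matrix: intercept column plus one column per factor.
    X = np.column_stack([np.ones(len(y)), mkt_rf, smb, hml])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return tuple(coef)


# Synthetic check: build returns from known loadings, then recover them.
rng = np.random.default_rng(0)
mkt_rf = rng.normal(0.0, 0.010, size=60)
smb = rng.normal(0.0, 0.005, size=60)
hml = rng.normal(0.0, 0.005, size=60)
algo = 0.001 + 1.2 * mkt_rf + 0.5 * smb - 0.3 * hml

alpha, b_mkt, b_smb, b_hml = factor_loadings(algo, mkt_rf, smb, hml)
```

The intercept is the return left unexplained by the factors, and the three slopes show how strongly the strategy loads on market, size, and value.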
Feel free to play around with this and share your findings if you come across anything interesting. I'm keen to see what you come up with.
One thing I'd be particularly interested in is weighting equities: in the canonical implementation, the universe is partitioned into six disjoint subsets, and equal weight is given to every equity in every subset. The problem is that an equity sitting right at a partition boundary carries the same weight as one firmly inside its category. It might be interesting to look into weighting equities within each subset according to their distance from the center of the subset.
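One possible way to try this, offered only as a starting point: give each equity a triangular weight that is highest at the midpoint of its subset's range and falls linearly to zero at the boundaries. The kernel shape, the function names, and the use of explicit lower and upper bounds are all assumptions made for this sketch. The outermost subsets have no natural outer bound, so one would have to be chosen (for example, the observed minimum or maximum).

```python
def center_weights(values, lower, upper):
    """Triangular weights over [lower, upper], normalized to sum to 1.

    A value at the midpoint gets the largest weight; a value on either
    boundary gets zero.
    """
    center = (lower + upper) / 2.0
    half_width = (upper - lower) / 2.0
    raw = [max(0.0, 1.0 - abs(v - center) / half_width) for v in values]
    total = sum(raw)
    if total == 0:
        # Every value sits on a boundary: fall back to equal weights.
        return [1.0 / len(values)] * len(values)
    return [w / total for w in raw]


def weighted_return(returns, weights):
    """Weighted average return of one subset."""
    return sum(r * w for r, w in zip(returns, weights))


# Three equities in a subset spanning book-to-market 0.0 to 1.0.
bm_values = [0.25, 0.5, 1.0]
returns = [0.03, 0.01, 0.10]
weights = center_weights(bm_values, lower=0.0, upper=1.0)
subset_return = weighted_return(returns, weights)
```

Here the equity on the upper boundary drops out entirely, so its 10% return no longer dominates the subset the way it would under equal weighting.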