After the Live Tearsheets Review webinar (by Jess), I saw several users (Anthony, Joakim, James) create their own posts addressing their takeaways from the webinar.
My own takeaway from the webinar was different. Before the webinar it was not clear to me how the risk model fit into the evaluation process. Earlier posts (early 2018) on the risk model painted a clouded picture: I was left scratching my head over whether I needed to zero out my common returns, as some users were doing in posts on the forum, or get into full debug mode and figure out why some unintentional exposures were getting flagged in my tear sheets. After the webinar it is clear that the Quantopian evaluation team prefers not to see consistent exposure to any risk factor.
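For anyone in the same boat: the pattern I now understand to be preferred is constraining risk factor exposure at the portfolio level, rather than stripping it out of the signal by hand. A minimal sketch, assuming the standard risk_loading_pipeline helper and the Optimize API's RiskModelExposure constraint (scheduling and alpha construction are omitted, and this is not my actual algorithm):

```python
import quantopian.algorithm as algo
import quantopian.optimize as opt
from quantopian.pipeline.experimental import risk_loading_pipeline

def initialize(context):
    # Attach the built-in risk loadings pipeline alongside the alpha pipeline.
    algo.attach_pipeline(risk_loading_pipeline(), 'risk_loadings')

def rebalance(context, data):
    risk_loadings = algo.pipeline_output('risk_loadings')
    # context.alpha: a per-asset alpha Series, built elsewhere in the algorithm.
    objective = opt.MaximizeAlpha(context.alpha)
    constraints = [
        opt.MaxGrossExposure(1.0),
        opt.DollarNeutral(),
        # Keep portfolio-level exposure to each common (sector/style) factor
        # within default bounds, instead of zeroing out common returns
        # inside the signal itself.
        opt.experimental.RiskModelExposure(
            risk_model_loadings=risk_loadings,
            version=opt.Newest,
        ),
    ]
    algo.order_optimal_portfolio(objective, constraints)
```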
I also came away with a more holistic perspective on risk, which appears to have been the main focus of the webinar. Afterwards I realized that risk comes in many forms, not just risk factors: it is also in how we approach strategy building and in the decisions we take at each point, from factor creation to alpha combination. I revisited every decision with risk implications and reworked it so that minimizing risk is the central theme at each decision point.
Like James, I am in favor of statistical significance (over long backtest periods) and of taking trading costs into account while building factors (one reason I haven't been able to adopt AlphaLens).
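To make "statistical significance" concrete, the kind of check I have in mind is a simple t-test on daily returns over a long backtest. A rough sketch, assuming daily_returns is a pandas Series of the strategy's daily returns:

```python
import numpy as np
from scipy import stats

def significance_of_returns(daily_returns):
    """One-sample t-test: are mean daily returns distinguishable from zero?"""
    r = daily_returns.dropna()
    t_stat, p_value = stats.ttest_1samp(r, 0.0)
    sharpe = np.sqrt(252) * r.mean() / r.std()  # annualized, for context
    return t_stat, p_value, sharpe
```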
Another piece of feedback I received that I think has a lot of merit is using statistically significant multi-year holdout periods (e.g., a 50:50 split). I will do that going forward. I didn't do it during this algorithm's development, mostly because I didn't realize it was expected. From the lecture on overfitting I watched a year ago, I walked away thinking overfitting was some kind of fraudulent practice that cheaters engage in deliberately. My new understanding is that the fitting that happens unconsciously in our attempts to meet the requirements can only be caught by installing a process before strategy development starts, and part of that process is setting aside a meaningful multi-year holdout period.
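In code terms the process change is trivial; the discipline is in honoring it. A sketch of a 50:50 split (the dates are placeholders, not my actual range):

```python
import pandas as pd

# Full data range available for development (placeholder dates).
start, end = pd.Timestamp('2004-01-01'), pd.Timestamp('2018-12-31')
midpoint = start + (end - start) / 2

in_sample = (start, midpoint)   # iterate freely on this half
holdout   = (midpoint, end)     # run once, only after the design is frozen
```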
I have spent 14 months on this single algorithm. It uses multiple factors: each equity in my universe, derived from QTradableStocksUS, is ranked on each factor every day, and alpha combination works across all factors (a toy sketch of this step is below). Market regime is not detected or used in the algorithm in any form, and I use the same processing routine every day. It takes about 25 seconds a day to go through all the computations, so I am having to chunk my tear sheets into three-year periods, which is the most I can do without hitting the long-running-backtest timeouts that have become common lately.
So there are five tear sheets, one for each three-year period, starting 2014.
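For concreteness, here is a toy sketch of the daily ranking-and-combination step mentioned above. My real factors and weighting differ; factors is assumed to be a DataFrame with one row per equity in the universe and one column per factor value, recomputed each day:

```python
import pandas as pd

def combine_alphas(factors: pd.DataFrame) -> pd.Series:
    """Equal-weight combination of cross-sectional factor ranks."""
    ranks = factors.rank(axis=0)       # rank equities within each factor
    combined = ranks.mean(axis=1)      # average ranks across factors
    return combined - combined.mean()  # demean for a long-short tilt
```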
I'd appreciate it if you could point out any issues (red flags) I should address; constructive feedback and different viewpoints are also welcome. I have submitted this version to the contest and don't expect to change the logic much further unless a red flag is found, which I am hoping community members who are good at tear sheet evaluation can help me with. Thanks to Joakim for suggesting I use the hide_positions option.
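For anyone else who wants to do the same, this is the research-environment call I believe that suggestion maps to (assuming create_full_tear_sheet passes hide_positions through to pyfolio):

```python
bt = get_backtest('...')  # backtest ID elided
bt.create_full_tear_sheet(hide_positions=True)
```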