I just skimmed over this announcement:
https://www.quantopian.com/posts/new-data-factset-estimates
Buried within it:
Contest & Allocations
Consensus Estimates and Actuals are both available in backtesting, so you are able to use them in the contest. FactSet estimates data is the first of its kind on Quantopian. Strategies written using this dataset are likely to be uncorrelated to strategies that are already running in the contest. For this reason, we heavily encourage you to familiarize yourself with this dataset and try to enter estimates-based strategies in the contest. To further incentivize that, we are increasing the limit on the number of contest entries to 5 per person so that you can make a new entry with FactSet Estimates without having to withdraw one of your existing entries. Algorithms that use FactSet Estimates are eligible to be considered for an allocation.
All good. My question is: how viable would a strategy be if it were based solely on the new FactSet estimates data set? One important point of the excellent architectural piece by Jonathan Larkin, A Professional Quant Equity Workflow, is that quants should think in terms of combining a set of alpha factors ("A successful strategy usually includes many individual alphas"). The contest has been running for a while and has a large field of over 300 entrants, so I would expect a significant number of them to have followed Jonathan's advice and built multi-factor algos. Those entrants would simply add one or more estimates-based alpha factors to their already proven algos, rather than submit new algos built exclusively on estimates-based alpha factors. If this assumption is true, then winning money in the contest would require more than a single-factor algo (or one based on multiple factors that are all derived from the same data set). The contest scoring does not include any weighting for being uncorrelated with the other algos in the field, so I don't understand the argument that an algo based solely on the new FactSet estimates data would be successful.
Or is this all wrong-headed thinking, and should I in fact write a contest algo based exclusively on the new FactSet estimates data set? If so, please explain why.
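To make the multi-factor scenario above concrete, here is a minimal sketch of what I mean by "adding an estimates-based factor to an existing algo". It is plain Python, not Quantopian Pipeline code; the factor names, tickers, values, and the equal-weight z-score combination are all my own assumptions for illustration, not anything taken from the announcement or from Jonathan's piece.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize {asset: raw value} to zero mean and unit (population) std."""
    raw = list(values.values())
    mu = mean(raw)
    sigma = pstdev(raw)
    if sigma == 0:
        # A constant factor carries no cross-sectional information.
        return {asset: 0.0 for asset in values}
    return {asset: (v - mu) / sigma for asset, v in values.items()}


def combine_factors(factors, weights=None):
    """Weighted sum of z-scored factors over the assets common to all of them.

    factors: {factor_name: {asset: raw value}}
    weights: {factor_name: weight}; defaults to equal weights (an assumption).
    """
    names = list(factors)
    if weights is None:
        weights = {name: 1.0 / len(names) for name in names}
    # Only score assets that every factor covers.
    assets = sorted(set.intersection(*(set(f) for f in factors.values())))
    z = {
        name: zscore({asset: factors[name][asset] for asset in assets})
        for name in names
    }
    return {asset: sum(weights[n] * z[n][asset] for n in names) for asset in assets}


# Made-up numbers: one hypothetical "existing" factor plus one hypothetical
# estimates-based factor (e.g. some measure of earnings surprise).
existing_factor = {"AAA": 0.8, "BBB": -0.1, "CCC": 0.3, "DDD": -0.5}
estimates_factor = {"AAA": 0.02, "BBB": 0.05, "CCC": -0.01, "DDD": 0.00}

combined = combine_factors(
    {"existing": existing_factor, "estimates": estimates_factor}
)
for asset, score in combined.items():
    print(asset, round(score, 3))
```

The point of the sketch is only that the estimates-based factor becomes one more input to the combined score, which is what I would expect most existing entrants to do.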
As a footnote, I think the statement "Strategies written using this dataset are likely to be uncorrelated to strategies that are already running in the contest" needs justification. I'd also bet that "uncorrelated" is too strong, since it suggests a complete absence of correlation, which is highly unlikely. I would consider the FactSet estimates data to be effectively public-domain within the trading industry, so it is quite possible that the information is already largely incorporated into other data sets that are being used in the contest (and the Q fund). There's no exclusivity to the estimates data, so why would it contain new information? But maybe my assumption that the existing data sets already contain estimates information is wrong, and we could actually see totally uncorrelated estimates-based alpha factors. Any insights?
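For what it's worth, the "uncorrelated" claim is something that could be checked directly. Below is a minimal sketch of the kind of check I have in mind: a plain Pearson correlation between the daily returns of two algos. The return series are made up, the function name is my own, and a real test would use actual backtest returns over a much longer period.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up daily returns for a hypothetical existing algo and a hypothetical
# estimates-only algo; real numbers would come from backtests.
existing_algo_returns = [0.001, -0.002, 0.003, 0.000, -0.001, 0.002]
estimates_algo_returns = [0.002, -0.001, 0.001, 0.001, -0.002, 0.001]

print(round(pearson(existing_algo_returns, estimates_algo_returns), 3))
```

A value near zero over a long sample would support the announcement's statement; a value well above zero would suggest the estimates information is already reflected in what existing algos use.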