Method:
1. Take related stocks (use clustering or simply use a sector)
2. Remove PC 1 factor
3. Sum the residuals and build 2 signals.
4. Trade portfolio
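The steps above can be sketched as follows. This is a minimal illustration, not the original algo: the random `returns` matrix is a stand-in for actual sector returns, and the 2-day / 20-day windows are taken from the signal line quoted later in the thread.

```python
import numpy as np

# Hypothetical stand-in for step 1: returns of 5 related stocks over 60 days.
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=(60, 5))

# Step 2: remove the PC 1 factor via SVD on demeaned returns.
X = returns - returns.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
pc1 = U[:, :1] * s[0]          # first-PC factor return series
beta = Vt[0:1, :]              # each stock's loading on PC 1
resid = X - pc1 @ beta         # residual returns with PC 1 removed

# Step 3: sum residuals over a short and a long window to build 2 signals.
short_sig = resid[-2:, :].sum(axis=0)    # 2-day cumulative residual
long_sig = resid[-20:, :].sum(axis=0)    # 20-day cumulative residual

# Step 4 (not shown): trade a portfolio weighted by these signals.
print(resid.shape, short_sig.shape, long_sig.shape)
```

Note that after the SVD step, the residuals are exactly orthogonal to the first principal direction, which is the sense in which PC 1 has been "removed".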
Hi Pravin -
I kinda get the impression that for the Q fund, anything that just uses OHLCV data will be low on the priority list (although in a multi-factor algo, I'd think one would still want to mix in factors based solely on such data). Since in theory Q only looks at algo "exhaust", I suppose they'll rely on self-reporting (and for a multi-factor algo, one would need to isolate the returns associated with factors based on novel data). For the contest, it is a different story, both because access to the novel data sets is limited (unless one is willing to buy them) and because one wants an algo capable of a nice 6-month run, which I think can be obtained with OHLCV data alone.
What's your goal here?
Regarding datasets for the contest: they purportedly have 50 datasets, but I don't see anything meaningful beyond 4-5 (other than fundamentals). How are we supposed to find an edge with such limited datasets?
My understanding is that it is just too expensive to offer all of the data to the masses. Makes sense. However, as I understand it, Q will run backtests over the full period, up to the present, and algos would be eligible for allocations.
I guess one approach would be to get an allocation but in the contract as a "manager" insist that one then gets free access to all Q data, for use only to create algos for the fund.
Hi,
Sorry to bump such an old thread but I am curious about the thinking here.
First off: my understanding (on a high level, please let me know if I misunderstood) is that you are saying that stocks within an industry share some kind of common driver (in this case the first principal component), and that all companies should revert toward the value predicted by this component.
So what you want to do here is to size your bets according to their deviation from the value predicted by the component.
That being understood, I still cannot understand this line:
R = sp.stats.zscore(model.resid[-2:, :].sum(axis=0)) - sp.stats.zscore(model.resid[-20:, :].sum(axis=0))
I cannot really understand why the signal is the difference between two z-scores 19 days apart, or am I missing something here?
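One way to read that line, using a toy residual matrix in place of `model.resid` (the windows and `zscore` calls are the only parts taken from the quoted code; everything else here is illustrative):

```python
import numpy as np
from scipy import stats

# Toy stand-in for model.resid: 20 days x 4 stocks of residual returns.
rng = np.random.default_rng(1)
resid = rng.normal(0.0, 1.0, size=(20, 4))

# The quoted line compares two cross-sectional z-scores per stock:
#   - cumulative residual over the last 2 days (recent move)
#   - cumulative residual over the last 20 days (longer-run move)
recent = stats.zscore(resid[-2:, :].sum(axis=0))
longer = stats.zscore(resid[-20:, :].sum(axis=0))
R = recent - longer

# R is large for stocks whose recent residual move is strong relative
# to their own 20-day move, i.e. a short-horizon vs long-horizon
# divergence signal, rather than two observations 19 days apart.
print(R)
```

Under this reading, the subtraction is not comparing two points in time but two accumulation windows that both end today, one nested inside the other.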