Any suggestions for improving the basic framework?
@Grant,
Looks good to me...is clean...and is pretty much the overall structure we are using. Thanks for publishing your template!
The only thing I'd add is an enhancement: do everything you are doing, but inside sectors or other types of clusters, in the hope that this lets you focus more on signal and less on noise. Grouping smaller sets of assets that share a common thread should also reduce the overall computational power needed, which leaves room for more sophisticated factor computations.
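As a rough illustration of scoring inside sectors, here is a minimal pandas sketch that z-scores a raw factor value against each asset's own sector instead of the whole universe. It is a stand-in for the idea, not Pipeline code; the function name `sector_zscore`, the tickers, and the numbers are all made up for the example.

```python
import pandas as pd


def sector_zscore(scores: pd.Series, sectors: pd.Series) -> pd.Series:
    """Z-score each asset's raw factor value against its own sector only."""
    grouped = scores.groupby(sectors)
    mean = grouped.transform("mean")
    std = grouped.transform("std")  # sample standard deviation (ddof=1)
    return (scores - mean) / std


# Hypothetical data: the two sectors sit on very different raw scales.
raw = pd.Series({"AAA": 1.0, "BBB": 3.0, "CCC": 10.0, "DDD": 14.0})
sector = pd.Series(
    {"AAA": "tech", "BBB": "tech", "CCC": "utilities", "DDD": "utilities"}
)

z = sector_zscore(raw, sector)
print(z)
```

After the transform, the weaker name in each sector gets the same negative score and the stronger name the same positive score, even though the raw utilities values are far larger. A universe-wide ranking would instead put both utilities names on top.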
The economic thesis that I see here is an implicit assumption: that your combined alpha factor can use top vs. bottom assets as an arbitrage that produces positive alpha at all times and under all conditions. Even with sector/cluster confinement, this is a tall order.
We've been looking at getting arbitrage inside sectors/clusters, with more risk_on/risk_off regime signals.
No overall success yet...of course, as soon as we get success, we'll go dark!...grin...
alan
Thanks Alan -
I suppose you are saying run the factors independently on each of the 11 risk model sectors (see https://www.quantopian.com/papers/risk), using Pipeline masking?
For example, my first factor would be:
    # Sum every factor, each evaluated only on the materials sub-universe
    combined_alpha_materials = None
    for name, f in factors.items():  # .items() works on Python 2 and 3
        if combined_alpha_materials is None:
            combined_alpha_materials = f(mask=universe_materials)
        else:
            combined_alpha_materials += f(mask=universe_materials)
Would I then sum over all 11 combined_alpha terms, yielding the final combined_alpha?
And then apply:
    longs = combined_alpha.top(NUM_LONG_POSITIONS)
    shorts = combined_alpha.bottom(NUM_SHORT_POSITIONS)
    long_short_screen = (longs | shorts)
    pipe = Pipeline(
        columns={'combined_alpha': combined_alpha},
        screen=long_short_screen,
    )
    return pipe
Sounds relatively straightforward to code.
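One wrinkle with summing the per-sector terms is worth checking. If a masked factor returns NaN for assets outside its mask (my understanding of Pipeline masking, not verified here), then a plain sum over sectors would be NaN for every asset. The pandas sketch below imitates that situation with made-up data and a hypothetical helper `masked_alpha`. It is not Pipeline code, just a way to see the arithmetic.

```python
import pandas as pd

# Hypothetical per-asset raw scores and sector labels.
scores = pd.Series(
    {"A": 0.9, "B": -0.4, "C": 0.1, "D": -1.2, "E": 0.5, "F": 0.4}
)
sectors = pd.Series(
    {"A": "materials", "B": "materials", "C": "materials",
     "D": "energy", "E": "energy", "F": "energy"}
)


def masked_alpha(scores, sectors, sector):
    """Demean scores within one sector; assets outside it become NaN."""
    masked = scores.where(sectors == sector)
    return masked - masked.mean()  # mean() skips the NaN entries


per_sector = [masked_alpha(scores, sectors, s) for s in sectors.unique()]

# Plain sum: NaN + number = NaN, so every asset ends up NaN.
naive = sum(per_sector)

# Treat "outside the mask" as a zero contribution before summing.
combined_alpha = sum(term.fillna(0.0) for term in per_sector)

NUM_LONG_POSITIONS = 2
NUM_SHORT_POSITIONS = 2
longs = combined_alpha.nlargest(NUM_LONG_POSITIONS).index
shorts = combined_alpha.nsmallest(NUM_SHORT_POSITIONS).index
print(naive.isna().all(), list(longs), list(shorts))
```

Note that taking the top and bottom of the summed score still ranks across the whole universe. If the goal is sector-neutral books, the long and short picks would have to be made per sector instead.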
One thought would be to use the sector ETFs for volatility weighting of the factors.
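One possible reading of that idea is inverse-volatility weighting: give each sector's combined alpha a weight proportional to one over the volatility of the matching sector ETF. The sketch below assumes that reading. The returns are synthetic random numbers, the tickers XLB, XLE and XLU are only used as column labels, and `inverse_vol_weights` is a hypothetical helper.

```python
import numpy as np
import pandas as pd


def inverse_vol_weights(returns: pd.DataFrame) -> pd.Series:
    """Weights proportional to 1 / std of each column, normalized to sum to 1."""
    inv_vol = 1.0 / returns.std()
    return inv_vol / inv_vol.sum()


# Synthetic daily returns with different volatilities (not real ETF data).
rng = np.random.default_rng(0)
etf_returns = pd.DataFrame({
    "XLB": rng.normal(0.0, 0.010, 250),
    "XLE": rng.normal(0.0, 0.020, 250),
    "XLU": rng.normal(0.0, 0.005, 250),
})

weights = inverse_vol_weights(etf_returns)
print(weights)
```

The quietest series gets the largest weight. The per-sector alphas would then be combined as a weighted sum rather than a plain sum.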
Any that are importable can be experimented with like this:
from quantopian.pipeline.experimental import BasicMaterials, CommunicationServices, ConsumerCyclical, ConsumerDefensive, Energy, FinancialServices, HealthCare, Industrials, Momentum, RealEstate, ShortTermReversal, Size, Technology, Utilities, Value, Volatility
# each one used alone, but over only one week, just for illustration
bmt = BasicMaterials() # - .14
com = CommunicationServices() # - 1.1
cyc = ConsumerCyclical() # - 1.5
cdf = ConsumerDefensive() # 2.5
eng = Energy() # .48
fin = FinancialServices() # - 1.2
hlt = HealthCare() # - .46
ind = Industrials() # - .72
mom = Momentum() # -1.16
rst = RealEstate() # .86
siz = Size() # .02
srv = ShortTermReversal() # - .25
tec = Technology() # -1.1
utl = Utilities() # 1.1
val = Value() # .64
vlt = Volatility() # -1.2
Hi @Grant @Alan,
I've implemented 2 scenarios of per-sector alpha computation based on your comments and your algo, @Grant. Both perform worse than an alpha computation that does not take sectors into account.
The two per-sector scenarios differ in how the universes are selected.
I was expecting one of the 2 per sector scenarios to perform better than the global one. Any clue what I might be doing wrong?
Thanks in advance.
Version not taking into account the sectors:
@ Marc -
I haven't tried it, but I think the idea here is that rather than running each factor over the entire QTradableStocksUS universe and ranking/scoring, there may be an advantage in pairing factors with sub-universes to improve the signal-to-noise ratio. Hypothetically, a fundamental ratio that works well for Technology stocks might not work so well for Utilities. Or perhaps certain technical indicator factors favor higher volatility stocks. Or a factor based on news would tend to be more predictive for stocks of big-name companies that tend to be covered by the press. Etc.
There's an over-fitting problem here, though. If a given factor is tested on each of the sectors, one sector will inevitably look better than the others over a given time frame. If there's no rationale for why, then the apparent advantage could become a disadvantage out-of-sample.
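The selection effect can be seen in a small simulation. The sketch below assumes a factor with no real edge in any of 11 sectors (daily "factor returns" are pure noise), picks the sector that looked best in-sample, and then checks that same sector on fresh data. All numbers and names are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
N_SECTORS, N_DAYS, N_TRIALS = 11, 250, 2000


def best_sector_edge(rng):
    """Return (in-sample, out-of-sample) mean return of the best-looking sector."""
    # Zero true mean everywhere: any apparent edge is noise.
    in_sample = rng.normal(0.0, 0.01, size=(N_DAYS, N_SECTORS))
    out_sample = rng.normal(0.0, 0.01, size=(N_DAYS, N_SECTORS))
    best = in_sample.mean(axis=0).argmax()  # sector chosen after the fact
    return in_sample[:, best].mean(), out_sample[:, best].mean()


results = np.array([best_sector_edge(rng) for _ in range(N_TRIALS)])
avg_in, avg_out = results.mean(axis=0)
print("average in-sample edge of chosen sector: ", avg_in)
print("average out-of-sample edge of same sector:", avg_out)
```

The chosen sector shows a clearly positive average in-sample, purely because it was selected for looking best, while its out-of-sample average sits near zero. A rationale decided before looking at the results is what separates a real pairing from this kind of artifact.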