Pravin,
I have no 'coding skills' (yet) to speak of. But, conceptually, I think what you would want to do is write code that does the following:
a) Search for and prepare a universe of the underlying stock holdings that meet some core liquidity requirements and are core holdings of a specific sector ETF (this could then be extended to all ETFs).
b) Calculate the trailing X period look-back 'beta' (or degree of covariance) for these individual stocks to that sector ETF (not to the SPY).
c) Rank these holdings on some known and vetted basic 'stock alpha factors' vs. the index. (these will vary significantly based on your time frame... but some methods will be discussed below)
d) Select enough of these stocks (say 30) so that the basket is likely to mathematically track the underlying return factor indexes. This is actually very important for the system to work long-term.
e) Weight the stocks based on a mean variance optimization with constraints. However, we are not really interested in forecasting the returns. We are only interested in forecasting the 'excess returns' above the sector ETF while minimizing variance (and/or drawdown) based on that.
f) Short the ETF in an amount that makes the beta of the long-short book market neutral.
g) (For a more complex - but more interesting - system, we would create a 'market view' module / function (I still don't know the difference), and then allow the 'market exposure' to vary within some constraints, say from a 0.2 beta target to a -0.2 beta target.)
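A rough sketch of steps b) and f), using only the Python standard library. The function names, the sample returns, and the convention that net beta is measured against the long capital are all assumptions made for illustration, not part of the outline above.

```python
from statistics import mean

def trailing_beta(stock_returns, etf_returns):
    """Beta of a stock to the sector ETF: cov(stock, etf) / var(etf)."""
    if len(stock_returns) != len(etf_returns) or len(etf_returns) < 2:
        raise ValueError("need two equal-length series with at least 2 returns")
    mean_s, mean_e = mean(stock_returns), mean(etf_returns)
    cov = sum((s - mean_s) * (e - mean_e)
              for s, e in zip(stock_returns, etf_returns))
    var = sum((e - mean_e) ** 2 for e in etf_returns)
    if var == 0:
        raise ValueError("ETF returns have zero variance")
    return cov / var

def etf_short_notional(long_dollars, betas, target_beta=0.0):
    """Dollars of the ETF to short so that net beta / long capital = target_beta.

    Assumption: the ETF has a beta of 1 to itself, and the target is
    expressed relative to the total dollars held long.
    """
    capital = sum(long_dollars.values())
    beta_dollars = sum(long_dollars[t] * betas[t] for t in long_dollars)
    return beta_dollars - target_beta * capital

# Hypothetical look-back window of periodic returns.
etf = [0.010, -0.020, 0.015, 0.005, -0.010]
stock = [1.5 * r for r in etf]          # constructed to have a beta near 1.5
b = trailing_beta(stock, etf)

# Hypothetical long book and sector betas.
longs = {"AAA": 10_000.0, "BBB": 5_000.0}
betas = {"AAA": 1.2, "BBB": 0.8}
hedge_neutral = etf_short_notional(longs, betas)                  # step f)
hedge_tilted = etf_short_notional(longs, betas, target_beta=0.2)  # step g)
```

With a 'market view' as in g), the only thing that changes is `target_beta`, which would be kept inside the -0.2 to 0.2 band.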
As for the 'excess returns': these can be driven by many, many factors. The conceptual approach I would take is to write off-line scripts that run single factor tests on baskets of 'similar historical' stocks (i.e. stocks in a similar sector and market cap range, in the same country, with a similar 'sector beta') to predict the excess returns of that factor on these 'types of stocks' in isolation.
For example, EBIT/EV has had some trailing X look-back period 'excess return' in this sector. I would probably want to use just the 'recent' factor performance, assuming 'momentum' and persistence in factors / styles.
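One simple way a single factor test like this could be set up is to rank the basket on the factor, hold the top slice equally weighted, and compare its return to the sector ETF over the same period. This is only a sketch: the function name, the 40% cut-off, and every number below are made up for illustration.

```python
def factor_excess_return(factor_values, forward_returns, etf_return,
                         top_fraction=0.2):
    """Equal-weighted return of the top-ranked stocks minus the ETF return."""
    ranked = sorted(factor_values, key=factor_values.get, reverse=True)
    n_top = max(1, int(len(ranked) * top_fraction))
    top = ranked[:n_top]
    basket_return = sum(forward_returns[t] for t in top) / n_top
    return basket_return - etf_return

# Hypothetical EBIT/EV values and next-period returns for five similar stocks.
ebit_ev = {"A": 0.12, "B": 0.09, "C": 0.07, "D": 0.04, "E": 0.02}
fwd_ret = {"A": 0.03, "B": 0.02, "C": 0.01, "D": 0.00, "E": -0.01}
sector_etf_return = 0.01

excess = factor_excess_return(ebit_ev, fwd_ret, sector_etf_return,
                              top_fraction=0.4)
```

Running this over each period in the look-back window and averaging only the most recent periods would give the 'recent' factor performance mentioned above.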
For initial factors to test:
1. (EBIT/EV)
2. (blended momentum factor - say 30,60 and 90 day momentum as well as 'consistency of' momentum)
Each of the above 'composite factors' would be 'normalized' and weighted at 50%. All stocks would be ranked on their trailing period 'value-mo' score from 1-100.
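A minimal sketch of the 50/50 scoring, assuming 'normalized' means a cross-sectional z-score and that the momentum input has already been blended into one number per stock. The helper names and sample values are hypothetical.

```python
from statistics import mean, pstdev

def zscores(values):
    """Cross-sectional z-score of a {ticker: value} mapping."""
    xs = list(values.values())
    mu, sigma = mean(xs), pstdev(xs)
    if sigma == 0:
        return {k: 0.0 for k in values}
    return {k: (v - mu) / sigma for k, v in values.items()}

def value_mo_scores(ebit_ev, momentum, w_value=0.5):
    """Blend the two normalized factors and rank the result from 1 to 100."""
    zv, zm = zscores(ebit_ev), zscores(momentum)
    composite = {k: w_value * zv[k] + (1 - w_value) * zm[k] for k in ebit_ev}
    ordered = sorted(composite, key=composite.get)   # worst first, best last
    n = len(ordered)
    if n == 1:
        return {ordered[0]: 100.0}
    return {k: 1 + 99 * i / (n - 1) for i, k in enumerate(ordered)}

# Hypothetical inputs: EBIT/EV and a blended 30/60/90 day momentum number.
ebit_ev = {"A": 0.12, "B": 0.08, "C": 0.04}
momentum = {"A": 0.10, "B": 0.05, "C": -0.02}
scores = value_mo_scores(ebit_ev, momentum)
```

The 'consistency of' momentum piece is left out here; it could enter as one more term in the blended momentum number before the z-score is taken.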
You would then check how well those 'value-mo' scores have predicted 'excess performance' over the past X days (say 30 and 60 days). I would expect this module to work really well.
You would use this 'prediction equation' based on the loading on these factors to generate your 'excess return' equation for the optimization.
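The check and the 'prediction equation' could be sketched as a rank correlation between scores and the excess returns that followed, plus a one-variable least-squares line. Both choices are assumptions about what is meant here, and the data is invented.

```python
from statistics import mean

def ranks(xs):
    """Ranks from 1..n (no special handling of ties in this sketch)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out

def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("a series has zero variance")
    return cov / (vx * vy) ** 0.5

def rank_ic(scores, excess_returns):
    """Spearman-style rank correlation of scores vs. realized excess returns."""
    return pearson(ranks(scores), ranks(excess_returns))

def fit_line(xs, ys):
    """Least-squares slope and intercept for excess_return ~ score."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical scores and the excess returns realized over the next 30 days.
past_scores = [10, 40, 55, 80, 95]
past_excess = [-0.02, 0.0, -0.005, 0.01, 0.02]
ic = rank_ic(past_scores, past_excess)

slope, intercept = fit_line([0, 50, 100], [-0.01, 0.0, 0.01])
forecast = intercept + slope * 75    # forecast excess return for a score of 75
```

The fitted line then supplies the expected 'excess return' per stock that feeds the optimization.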
However, I would likely also put in a 'safeguard' of 'constraints' that the weights had to stay within, as well as a constraint that looked at how well the forecasts have been doing of late. If the forecasts have been poor, I would just use an equal variance weighting method.
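A sketch of that safeguard logic. It is not a full mean variance optimizer: the forecast-driven branch uses a crude forecast / variance tilt that ignores correlations, and 'equal variance weighting' is read here as inverse-volatility weights. The 0.05 threshold, the 40% cap, and all inputs are assumptions.

```python
def inverse_vol_weights(vols):
    """Weights proportional to 1 / volatility, summing to 1."""
    inv = {k: 1.0 / v for k, v in vols.items()}
    total = sum(inv.values())
    return {k: x / total for k, x in inv.items()}

def cap_weights(weights, max_w):
    """Cap each weight at max_w and hand the excess to the uncapped names."""
    if max_w * len(weights) < 1.0:
        raise ValueError("cap too tight for weights to sum to 1")
    w = dict(weights)
    for _ in range(len(w)):
        over = [k for k, x in w.items() if x > max_w + 1e-12]
        if not over:
            break
        excess = sum(w[k] - max_w for k in over)
        for k in over:
            w[k] = max_w
        free = [k for k, x in w.items() if x < max_w - 1e-12]
        if not free:
            break
        free_total = sum(w[k] for k in free)
        for k in free:
            # Pro rata if possible, otherwise split the excess evenly.
            share = w[k] / free_total if free_total > 0 else 1.0 / len(free)
            w[k] += excess * share
    return w

def choose_weights(forecasts, vols, recent_ic, min_ic=0.05, max_w=0.40):
    """Use forecasts only if they have been working; otherwise fall back."""
    base = None
    if recent_ic >= min_ic:
        raw = {k: max(forecasts[k], 0.0) / vols[k] ** 2 for k in vols}
        total = sum(raw.values())
        if total > 0:
            base = {k: x / total for k, x in raw.items()}
    if base is None:                      # poor forecasts: ignore them
        base = inverse_vol_weights(vols)
    return cap_weights(base, max_w)

# Hypothetical volatilities and forecast excess returns.
vols = {"A": 0.10, "B": 0.20, "C": 0.20}
forecasts = {"A": 0.00, "B": 0.02, "C": 0.01}
fallback = choose_weights(forecasts, vols, recent_ic=-0.02)
tilted = choose_weights(forecasts, vols, recent_ic=0.10)
```

In a real build the forecast branch would be replaced by a proper constrained optimizer; the fallback switch and the weight bounds would stay the same.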
There are other factors that will work better than these over very short time periods - but the factors above should work well at various rebalance time frames.
I can't code at all yet, and would be open to a collaboration if you are an expert coder and want to build something.
Feel free to email me here or off-line ([email protected]) and we can talk some more.
Hope this helps.
Best,
Tom