multi-factor algo example

Here's an algo for your consideration. Will post tear sheet next. Appears to be good-to-go for the contest.

11 responses

Tear sheet for backtest above.

Try to address the -9% dip when the backtest starts in 2013 (a three-year loss).

Also give a 2008 start a go. Q might want to know about this error message, raised around Aug 26, where one NaN column passed to RiskModelExposure() is hidden from the listing:

ValueError: NaN or Inf values provided to FactorExposure for argument 'loadings'.  
Rows/Columns with NaNs:  
  row=Equity(32430 [SHA]) col='industrials'  
  row=Equity(32430 [SHA]) col='momentum'  
  row=Equity(32430 [SHA]) col='short_term_reversal'  
  row=Equity(32430 [SHA]) col='size'  
  row=Equity(32430 [SHA]) col='value'  
  ... (1 more)  
Rows/Columns with Infs:  
  None  
There was a runtime error on line 213.  

For a workaround, consider something like:

    context.risk_loading_pipeline = pipeline_output('risk_loading_pipeline').dropna()
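To show what `.dropna()` does to a loadings table, here is a minimal sketch on a toy pandas DataFrame. The tickers `AAA` and `CCC`, the numbers, and the three-column layout are made up for illustration; only `SHA` and the column names come from the error message above, and the real risk loading pipeline has more columns.

```python
import numpy as np
import pandas as pd

# Hypothetical risk-loading table: one row per asset, one column per risk factor.
# All values are invented; SHA stands in for the asset named in the error message.
loadings = pd.DataFrame(
    {
        "momentum": [0.5, np.nan, -0.2],
        "size": [1.1, np.nan, 0.3],
        "value": [-0.4, np.nan, 0.9],
    },
    index=["AAA", "SHA", "CCC"],
)

# With default arguments, dropna() removes every row that has at least one NaN,
# so the all-NaN SHA row disappears and the other two rows are kept unchanged.
clean = loadings.dropna()

remaining_assets = clean.index.tolist()
```

Note that this drops the whole asset row from the loadings rather than filling in a value, so it is worth checking how the remaining rows line up with the rest of the algo.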

Holy smokes. With the workaround above and a 2008 start, the backtest not only gets through the crash, but on Aug 26 it is up +1% instead of down 2.7%, and it reaches +38% in May 2010. This suggests Q has work to do on the risk model constraint, and that there may be an easy improvement.

Why don’t you post the backtest?

Grant, an academic question: what does it mean in terms of IP when an algorithm is adapted from a published research paper and has been in the Quantopian Community for critique and input for years, fully open to competitors?

@ Karl -

I don't have a clue on the IP question. If you are looking for permission to use anything that I've posted on Quantopian, I hereby grant it (permission is automatically granted by the Quantopian Terms of Use). Presumably, Quantopian has to address this issue formally when they license algos. The algo I posted above would be an interesting test case. I have to figure that Quantopian will provide some legal advice regarding copyright/licensing/IP to authors, upon request.

Here's a longer backtest, with the suggestion above:

context.risk_loading_pipeline = pipeline_output('risk_loading_pipeline').dropna()

I also dropped the Direction factor that was not being used.

I tried nanfill on Volatility. The lower result seems to say that the more NaNs there are, the higher the return: 11% when forward-filling NaNs versus 29% when leaving them alone. The count looks like around half a million NaNs on the first day; no idea why it reports each one three times. Original date range ...

One mystery is why there are nans in the first place. I'm a bit confused. I would have expected with the QTradableStocksUS and Pipeline, we'd be nan-free. What am I missing? Is it that the Pipeline data are not forward-filled? Or are the nans errors in the data?

Meanwhile, run this and look at the logging window: fcf is all zero, for example. One way to address that is a longer window in the factor (maybe 5 or 22 would do, or 63+ to cover a quarter), then forward-fill and return the last row:

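    import numpy as np
    from quantopian.pipeline import CustomFactor
    from quantopian.pipeline.data import Fundamentals

    class fcf(CustomFactor):
        inputs = [Fundamentals.fcf_yield]
        window_length = 88    # was 1
        def compute(self, today, assets, out, fcf_yield):
            # nanfill and preprocess are helpers defined elsewhere in the algo
            fcf_yield = nanfill(fcf_yield) # seems fine even without this, given longer window vs 1, why?
            out[:] = preprocess(np.nan_to_num(fcf_yield[-1]))  # [-1] added: keep only the latest row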
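The algo's `nanfill` helper isn't shown in this thread, so here is a minimal stand-in, assuming it forward-fills along the time axis of the (days x assets) window. The name `ffill_window` and the toy numbers are made up. It may also hint at why a window of 1 gives zeros: if the only row is NaN, `np.nan_to_num` turns it into 0, whereas a longer window has older values to carry forward.

```python
import numpy as np

def ffill_window(window):
    """Forward-fill NaNs down each column of a 2-D (days x assets) array.

    Hypothetical stand-in for the thread's nanfill helper. Leading NaNs
    (no earlier value to carry forward) are left as NaN.
    """
    window = np.asarray(window, dtype=float)
    valid = ~np.isnan(window)
    # For each cell, record its own row index if it holds a value, else 0.
    row_idx = np.where(valid, np.arange(window.shape[0])[:, None], 0)
    # Running maximum down each column gives the row of the latest valid value.
    np.maximum.accumulate(row_idx, axis=0, out=row_idx)
    return window[row_idx, np.arange(window.shape[1])]

nan = np.nan
# Toy window: 4 days, 2 assets; values are reported only occasionally.
window = np.array([
    [1.0, nan],
    [nan, nan],
    [nan, 2.0],
    [nan, nan],
])

filled = ffill_window(window)

last_without_fill = np.nan_to_num(window[-1])   # latest row is all NaN, so it becomes zeros
last_with_fill = np.nan_to_num(filled[-1])      # carries forward 1.0 and 2.0
```

Whether forward-filling is the right choice is a separate question, since it reuses stale values for as long as the window allows.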

Thanks Blue - I hope to get back to this. Will post an update at some point with a list of open issues and questions.