Factor Tear Sheet

Mar 9, 2016

Great work.
Hopefully Q can post a master thread that covers the entire workflow, from factor brainstorming all the way to this tear sheet.

Mar 9, 2016

Any conclusions? For the securities analyzed, is the factor good/bad/can't tell? Why?

One thought is that it would be nice to run it on a known-good factor and a known-bad factor, for comparison and testing the tool. Or maybe the one you selected is known-good?

Mar 10, 2016

To see how the tool works, would it be feasible to gin up some data for pipeline that would result in the good/bad/ugly cases? In other words, simulate the inputs, to test the tool (since at this point, we may not have any known good/bad/ugly examples). It's kinda like if we were to design a new thermometer. We'd stick it in some ice water to see if we got 0 C, and some boiling water for 100 C (and maybe dry ice and liquid nitrogen, too). And then proceed to measure unknown temperatures.

Within the research platform, is it possible to simulate the input to pipeline? Is it a matter of replacing:

from quantopian.pipeline.data import morningstar  
from quantopian.pipeline.data.builtin import USEquityPricing

with some blocks of code?

Grant,

Let me clarify: you want to create a fake factor dataset and two fake price datasets?

The factor is highly predictive for one dataset and not predictive at all for the other dataset.

Then run two factor tear sheet simulations to see the difference in results.

@ Miles,

That's the concept. Rather than "fake" I'd say "simulated" data, but it is a matter of word choice (sometimes, I think the term "toy data set" is used). Andrew is looking at free cash flow yield and EBITA/EV, to see if they have any predictive value as factors (the null hypothesis is that they are useless, and the alternative hypothesis is that they provide useful signals). I'm not sure yet, but I think the answer is "can't tell" (cannot reject the null hypothesis) but I don't yet understand the tool well enough. What is your conclusion and why?

I'm not yet clear if this can be done in the research platform. Rather than bog down this thread with the nitty gritty details, I've posted the question separately: https://www.quantopian.com/posts/possible-to-simulate-inputs-to-pipeline-in-the-research-platform .

Deleted User

@Grant

Most of the plotting functions in the factor tear sheet take a DataFrame with date, equity, sector_code, factor_value, and forward_price_movement columns. You could either build a fake pipeline output DataFrame in this form from scratch or generate a pipeline output using the construct_factor_history and add_forward_price_movement functions and replace the values in the factor and forward_price_movement columns as you see fit. Putting random floats into each column would simulate a bad factor. Identical factor and forward_price_movement columns would yield a perfectly predictive factor.
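For anyone who wants to try this outside the tear sheet first, here is a minimal sketch of the "from scratch" option, using only pandas and NumPy. The column names are the ones listed above; the function names, the equity labels, the sector codes, and the rank-correlation check are assumptions made up for illustration and are not part of the tear sheet code.

```python
import numpy as np
import pandas as pd


def make_simulated_factor_frame(kind, n_days=60, n_equities=20, seed=0):
    """Build a toy DataFrame shaped like the pipeline output described above.

    kind="perfect": forward_price_movement is a copy of factor_value.
    kind="random":  both columns are independent random floats.
    All names and values here are hypothetical, for illustration only.
    """
    rng = np.random.RandomState(seed)
    dates = pd.date_range("2015-01-01", periods=n_days, freq="B")
    equities = ["EQ%02d" % i for i in range(n_equities)]  # made-up labels
    index = pd.MultiIndex.from_product([dates, equities],
                                       names=["date", "equity"])
    frame = pd.DataFrame(index=index).reset_index()
    # Arbitrary sector codes 0..4, repeated in the same order for every date.
    frame["sector_code"] = np.tile(np.arange(n_equities) % 5, n_days)
    frame["factor_value"] = rng.standard_normal(len(frame))
    if kind == "perfect":
        frame["forward_price_movement"] = frame["factor_value"].copy()
    elif kind == "random":
        frame["forward_price_movement"] = rng.standard_normal(len(frame))
    else:
        raise ValueError("kind must be 'perfect' or 'random'")
    return frame


def mean_daily_rank_ic(frame):
    """Average, over dates, of the rank correlation between the two columns."""
    ics = [
        g["factor_value"].rank().corr(g["forward_price_movement"].rank())
        for _, g in frame.groupby("date")
    ]
    return float(np.mean(ics))


good = make_simulated_factor_frame("perfect")
bad = make_simulated_factor_frame("random")
print(mean_daily_rank_ic(good), mean_daily_rank_ic(bad))
```

The rank-correlation helper is only a quick sanity check that the two toy frames behave like the "boiling water" and "ice water" cases; the tear sheet's own plots would then be run on each frame.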


Thanks Andrew. Modifying the pipeline output might be the way to go. --Grant

[EDIT:] I'd like to be able to modify/simulate the input data to pipeline. Is there any way to do this?

Mar 24, 2016

Andrew,

Are you going to update this module now that open prices are available in the pipeline?

I have built a backtesting framework similar to yours, and it takes a little brain work to make sure that your data lines up.

I would be happy to assist.

My framework has noticeable speed improvements. I am not sure if updating the module to get returns from the backtester is necessary.

Deleted User

Mar 25, 2016

Miles,

This tearsheet is built to accept any Pipeline factor as an input. The analysis uses get_pricing to pull in the data needed to compute forward price movements. Are you suggesting using open prices in the calculation of forward price movement?
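To make the open-versus-close question concrete, here is a rough sketch of a forward price movement calculation in plain pandas. It does not call get_pricing and is not the tear sheet's actual implementation; the function name, the `days` parameter, and the hand-typed prices are assumptions for illustration. Under this sketch, switching to open prices would just mean passing a frame of opens instead of closes.

```python
import pandas as pd


def forward_price_movement(prices, days=1):
    """Fractional price change from each date to `days` rows later.

    `prices` is assumed to be a DataFrame indexed by date with one column
    per equity (hypothetical layout). The last `days` rows come out as NaN
    because there is no future price to compare against.
    """
    return prices.shift(-days) / prices - 1.0


# Hand-typed toy prices, not real market data.
toy_prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 121.0], "BBB": [50.0, 50.0, 25.0]},
    index=pd.date_range("2016-03-21", periods=3, freq="B"),
)
fwd = forward_price_movement(toy_prices, days=1)
print(fwd)
```

Lining the data up is the fiddly part: the value in each row has to describe the move that happens after the factor for that date is known, which is why the shift goes backward in time.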

Is your framework in an IPython notebook? If you are willing to share it, I'd love to take a look.
