Alphalens - Dealing with stocks without previous pricing data

I'm new to Quantopian and am trying out Alphalens, but I am running into an error on cell 3. I think it has something to do with the fact that I do not have a complete price history. Does anyone know of any way around this problem?

(Is there any way to filter assets without a complete pricing history out of the pipeline?)

2 responses

The clue here is the error message "Dropped 99.2% entries from factor data: 3.2% in forward returns computation and 96.0% in binning phase". 96% of the values were dropped in the 'binning phase'.

What does that mean? The default get_clean_factor_and_forward_returns behavior is to attempt to create 5 quantile buckets (or bins) each day, each with an equal number of values. One common problem is a factor that has a lot of results with a single value. In this case, there are a lot of zeros. The binning places all the zeros into a single bin, but since there are so many zeros, there aren't enough other values to create equal-size bins. The binning fails and generates an error message similar to the one above.
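A minimal sketch of that failure mode, using plain pandas rather than Alphalens itself. The toy series is made up, and the assumption is that equal-quantity bucketing behaves on a single day's values the way pandas' qcut does here:

```python
import pandas as pd

# Hypothetical one-day cross-section: 96 zeros plus four distinct values.
values = pd.Series([0.0] * 96 + [1.0, 2.0, 3.0, 4.0])

try:
    # Ask for 5 equal-quantity buckets, as the quantiles=5 default would.
    pd.qcut(values, 5, labels=False)
    quantile_binning_failed = False
except ValueError as err:
    # The 20/40/60/80% cut points are all 0, so the bin edges are not unique.
    quantile_binning_failed = True
    print("qcut failed:", err)
```

With 96% of the values equal to zero, every interior cut point lands on zero, so there is no way to draw five distinct buckets.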

So, one of the first things I typically do when using Alphalens, and analyzing factors in general, is to look at the distribution of factor values. First, ensure the data is meaningful. The BIGGEST issue with factors is often a lot of meaningless data which obscures the good data. In this case there are a lot of zeros which don't add much information. Second, ensure the data is reasonably well distributed. That's a bit arbitrary, but if values are 'bunched' up with a few stragglers at the extremes, maybe one would want to filter out the extremes? The couple of methods I use for a quick look at a factor are these:

# Plot a histogram of the factor values - looking for a somewhat uniform distribution
factor_data.my_factor.hist()

# List the quantity of each value - ensure there are not too many of a single value
factor_data.my_factor.value_counts()
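As a rough follow-up to value_counts, one could also look at the share of rows taken by the most common value. This is only a sketch on a made-up series (factor_values is a stand-in for factor_data.my_factor), and it pools all dates together, whereas the binning happens per day:

```python
import pandas as pd

# Hypothetical factor values standing in for factor_data.my_factor.
factor_values = pd.Series([0.0] * 96 + [1.0, 2.0, 3.0, 4.0], name="my_factor")

# normalize=True returns fractions instead of raw counts, sorted descending.
shares = factor_values.value_counts(normalize=True)
most_common_value = shares.index[0]
most_common_share = shares.iloc[0]

print(most_common_value, most_common_share)
```

As a rule of thumb (not a hard limit), a single value holding well over 1/5 of the rows is a warning sign when asking for 5 equal-quantity bins.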

Another approach is to not use the Alphalens defaults of quantiles=5, bins=None. Turn it around and use quantiles=None, bins=5. That will try to make 'equal width' bins rather than 'equal quantity' bins. The binning won't typically fail this way. When doing this, take note of the bins it produced and perhaps filter based upon those bins. The get_clean_factor_and_forward_returns call would look like this:

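from alphalens.utils import get_clean_factor_and_forward_returns

# bins=5 with quantiles=None requests 5 equal-width bins instead of 5 quantiles
merged_data = get_clean_factor_and_forward_returns(
    factor=factor_data.my_factor,
    prices=pricing_data,
    periods=range(1, 252, 10),
    bins=5,
    quantiles=None
)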
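To get a feel for what 'equal width' means on zero-heavy data, here is a small pandas-only sketch. The series is made up, and pd.cut is used as a stand-in for whatever Alphalens does internally when bins=5:

```python
import pandas as pd

# Hypothetical one-day cross-section: 96 zeros plus four distinct values.
values = pd.Series([0.0] * 96 + [1.0, 2.0, 3.0, 4.0])

# Five equal-width bins over the min..max range; retbins=True also returns the edges.
bin_labels, bin_edges = pd.cut(values, 5, labels=False, retbins=True)
bin_counts = bin_labels.value_counts().sort_index()

print(bin_edges)
print(bin_counts)
```

The binning succeeds, but the bins are very unevenly populated: all the zeros sit in the lowest bin. That is worth keeping in mind when reading the per-bin results.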

So, in this particular case, either drop all the zeros or use fixed-width bins, and you'll eliminate the error. Dropping all the zeros can be done by filtering in the original pipeline, or afterward by dropping those rows from the pipeline output.
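Dropping the zero rows afterward could look something like the sketch below. The frame is a made-up stand-in for pipeline output (the dates, the asset labels, and the values are all hypothetical); only the my_factor column name comes from the snippets above:

```python
import pandas as pd

# Hypothetical pipeline-style output: (date, asset) MultiIndex with one factor column.
dates = pd.to_datetime(["2020-01-02", "2020-01-03"])
assets = ["AAA", "BBB", "CCC"]  # placeholder asset labels
index = pd.MultiIndex.from_product([dates, assets], names=["date", "asset"])
factor_data = pd.DataFrame(
    {"my_factor": [0.0, 1.5, -0.7, 0.0, 0.0, 2.1]}, index=index
)

# Keep only the rows where the factor carries information (non-zero values).
nonzero_factor_data = factor_data[factor_data["my_factor"] != 0]

print(nonzero_factor_data)
```

The filtered my_factor column can then be passed as the factor argument in place of the original.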

I've made some of these changes to the notebook to see how it works.

Nice, I will try that out. Thanks for the great response!