Help on figuring out a NaN issue

Hello,

I'm trying to replicate the results from "The Little Book that Still Beats the Market". The strategy is simple: it ranks stocks by earnings_yield and return on invested capital (ROIC), then buys the best-ranked stocks and holds them for 1 year. I rank earnings_yield from lowest to highest and assign each stock a ranking score, then do the same for ROIC. Once both sets of rankings are complete, I add the two ranking scores together and store the sum as a final rank in the original pandas DataFrame. I then sort by that final rank in descending order and pick the top 60 stocks to hold.
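For readers following along, the ranking step described above could be sketched in plain pandas roughly as follows. This is only an illustration, not the original algorithm: the column names `earnings_yield` and `roic`, the helper `top_ranked`, and the sample values are all assumptions made up for the example.

```python
import pandas as pd

def top_ranked(df, n=60):
    """Return the n best rows by combined earnings-yield and ROIC rank.

    Hypothetical helper; assumes columns named 'earnings_yield' and 'roic'.
    """
    ranked = df.copy()
    # rank() is ascending by default: the lowest value gets rank 1,
    # so a larger rank score means a larger factor value.
    ranked["ey_rank"] = ranked["earnings_yield"].rank()
    ranked["roic_rank"] = ranked["roic"].rank()
    # Add the two ranking scores together to get the final rank.
    ranked["final_rank"] = ranked["ey_rank"] + ranked["roic_rank"]
    # Highest combined score first, then keep the top n.
    return ranked.sort_values("final_rank", ascending=False).head(n)

# Made-up values, for illustration only.
data = pd.DataFrame(
    {
        "earnings_yield": [0.02, 0.10, 0.05, 0.08],
        "roic": [0.30, 0.25, 0.05, 0.40],
    },
    index=["AAA", "BBB", "CCC", "DDD"],
)
picks = top_ranked(data, n=2)
print(picks)
```

Note that `rank()` leaves NaN inputs as NaN by default, so any ticker missing either factor ends up with a NaN final rank.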

I am having issues with NaNs. Some symbols have NaN values, which mess up my overall rankings. Any ideas for troubleshooting the NaNs? I would like to remove those tickers completely from all ranking lists.

Also, any help with filtering stocks by market cap would be great as well!

3 responses

dropna() is one route.
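As a minimal sketch of that route, assuming the factors sit in a pandas DataFrame with hypothetical columns `earnings_yield` and `roic` (the values below are made up):

```python
import numpy as np
import pandas as pd

# Made-up factor values; two tickers are missing one factor each.
factors = pd.DataFrame(
    {
        "earnings_yield": [0.02, np.nan, 0.05, 0.08],
        "roic": [0.30, 0.25, np.nan, 0.40],
    },
    index=["AAA", "BBB", "CCC", "DDD"],
)

# Keep only tickers that have a value for every factor used in the ranking.
clean = factors.dropna(subset=["earnings_yield", "roic"])
print(clean)
```

Dropping the rows before ranking means the removed tickers never receive a rank at all, rather than being ranked and filtered afterwards.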

This shows the number of nans for each of the factors, abbreviated:

 factor             min              mean                max     NaNs
    mkt     312392342.0     7774684074.94     311065842800.0
    pbr          0.1515          5.603815           769.2308     11/1553
    per          1.0059         65.838372            10000.0     58/1553
   roic       -9.660113          0.010075           0.981729     6/1553
    scr             nan               nan                nan     1553/1553     # value_score
    yld          -0.758          0.027015             0.9941
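A summary like this one can be produced in plain pandas along the following lines. This is a sketch under assumptions, not the code that generated the table above: `nan_summary` is a hypothetical helper and the sample values are invented.

```python
import numpy as np
import pandas as pd

def nan_summary(df):
    """Per-column min, mean, max and NaN count for a DataFrame of factors."""
    # agg() returns one row per statistic; transpose to get one row per factor.
    summary = df.agg(["min", "mean", "max"]).T
    summary["nans"] = df.isna().sum()   # NaN count per factor
    summary["rows"] = len(df)           # total rows, for "nans/rows" reading
    return summary

# Made-up values; 'scr' is entirely NaN, like value_score in the table.
factors = pd.DataFrame(
    {
        "roic": [0.1, np.nan, 0.3],
        "scr": [np.nan, np.nan, np.nan],
        "yld": [0.02, 0.03, 0.04],
    }
)
summary = nan_summary(factors)
print(summary)
```

A column whose NaN count equals the row count, like `scr` here, points to a factor that is never computed rather than to a few tickers with missing data.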

Using dropna() only, returns are around 1/3 for that time frame, 2004.

It's not uncommon.
NaNs produce unpredictable, false results, and often people don't realize it is happening. Good for you for noticing.

Another thing to know: after cloning, you can click a line number in the margin to set a breakpoint, then run the algorithm. It can be useful; it just starts slower.

This will hopefully give you more avenues to work with and help bring back returns, without the NaNs and with lots of flexibility.
It is a run of just a few weeks, so there is less to wait for here. Good luck.

Hello,

Thank you for the response. I followed your dropna() suggestion and it worked, with a few errors. However, the best solution I've found so far is to screen for NaNs in the pipeline up front: screen = criteria.notnan()
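The same idea can be sketched outside the pipeline as a boolean mask in plain pandas, which also gives a natural place for a market-cap filter. This is a pandas analogue rather than Pipeline API code; `screen_universe`, the column names, and the 1e9 threshold are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

def screen_universe(df, factor_cols, min_market_cap):
    """Keep rows with no NaN in factor_cols and market_cap >= min_market_cap.

    Hypothetical helper; assumes a column named 'market_cap'.
    """
    has_all_factors = df[factor_cols].notna().all(axis=1)
    # Comparisons against NaN are False, so a missing market cap is excluded.
    big_enough = df["market_cap"] >= min_market_cap
    return df[has_all_factors & big_enough]

# Made-up values, for illustration only.
universe = pd.DataFrame(
    {
        "market_cap": [5e8, 2e9, 8e9, np.nan],
        "earnings_yield": [0.02, np.nan, 0.05, 0.08],
        "roic": [0.30, 0.25, 0.10, 0.40],
    },
    index=["AAA", "BBB", "CCC", "DDD"],
)
screened = screen_universe(
    universe, ["earnings_yield", "roic"], min_market_cap=1e9
)
print(screened)
```

Screening first and ranking second keeps the rank scores dense, since excluded tickers never take up a rank position.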

Thanks,
Roo

attached for around 17%.

The algorithm was trying to replicate the results of the formula from "The Little Book That Beats the Market".