Beginner trying to ignore NaN type missing data in the pipeline

I've been trying to create some filters in the pipeline with fundamental data ratios. However, I want to use adjusted ratios. When I apply a filter, the data frame automatically invalidates a calculated field if there is at least one NaN in the calculation. Is there a way to replace NaN fields with zero so that the calculated fields are no longer invalid? This would allow me to use them later in the pipeline as a filter.

As an example, I've attached a notebook where the calculated field NWC is invalidated whenever unearned revenue is missing.

Any suggestions would be much appreciated!
Rob
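
The behaviour described above can be reproduced in plain pandas, outside of the pipeline. The following is only a minimal sketch: the column names, the values, and the NWC formula are invented for illustration and are not taken from the attached notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical fundamentals for three securities; all values are invented.
df = pd.DataFrame(
    {
        "current_assets": [100.0, 80.0, 60.0],
        "current_liabilities": [40.0, 30.0, 20.0],
        "unearned_revenue": [5.0, np.nan, 2.0],
    },
    index=["AAA", "BBB", "CCC"],
)

# Arithmetic propagates NaN: one missing input makes the whole result NaN.
nwc_raw = (
    df["current_assets"]
    - df["current_liabilities"]
    + df["unearned_revenue"]
)

# Filling the missing input with zero first keeps the result valid.
nwc_filled = (
    df["current_assets"]
    - df["current_liabilities"]
    + df["unearned_revenue"].fillna(0)
)
```

Here nwc_raw is NaN for the row with missing unearned revenue, while nwc_filled treats the missing value as zero.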

5 responses

Thanks for the tip, Stephen. However, I can only do this after I run the pipeline, which means I would have to calculate NWC again afterwards. What if I wanted to do additional calculations with NWC before running the pipeline? With .fillna(0) I would have to recalculate NWC after the run and filter multiple times. It would be easier if I could do all of this before I run the pipeline.

Hi Rob,

Try changing the next-to-last line to this:

result_df = run_pipeline(mypipeline(), '2016-05-05', '2016-05-05').fillna(0)

Good idea, but unfortunately I'm running into the same issue. So far I've had the most success attaching .isnan() to the column. However, this returns a boolean, and I'm having trouble specifying that I want 0 where it is True.

I'd appreciate a little more guidance.

Thanks
Rob
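
Turning a boolean "is missing" mask into zeros can be sketched in plain pandas as follows. This is an assumption-laden illustration on an ordinary Series, not pipeline code: it uses pandas' isna() and mask() rather than the pipeline's .isnan(), and the values are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value; the numbers are invented.
values = pd.Series([5.0, np.nan, 2.0], index=["AAA", "BBB", "CCC"])

# Boolean mask: True where the value is missing.
is_missing = values.isna()

# mask() replaces entries where the condition is True and keeps the rest.
zeroed = values.mask(is_missing, 0.0)
```

For this particular case values.fillna(0) gives the same result; the mask form is useful when the condition is something other than "is NaN".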

Here's my copy of your notebook with the fillna(0) change. It's returning zeros for the cells that were NaN in your first post.

run_pipeline() returns a DataFrame, so the .fillna(0) I added replaces the NaNs with zero in that returned DataFrame.
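
As a rough sketch of what fillna does to such a DataFrame, here is a small stand-in for the pipeline output. The frame below is built by hand with invented values and hypothetical column names; it is not produced by run_pipeline().

```python
import numpy as np
import pandas as pd

# Stand-in for the DataFrame returned by run_pipeline(); values are invented.
result_df = pd.DataFrame(
    {"unearned_revenue": [5.0, np.nan], "NWC": [65.0, np.nan]},
    index=["AAA", "BBB"],
)

# fillna returns a new DataFrame; the original is left unchanged.
filled = result_df.fillna(0)

# Passing a dict limits the fill to the named columns.
partly_filled = result_df.fillna({"unearned_revenue": 0})
```

Note that filling the NWC column with zero is not the same as recomputing NWC with a zero input, so a field derived from a missing value may still need to be recalculated after the fill.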