Beginner trying to ignore NaN type missing data in the pipeline

posted Jan 23, 2017

I've been trying to create some pipeline filters based on fundamental data ratios, but I want to use adjusted ratios. When I apply a filter, any calculated field with at least one NaN in its inputs comes out invalid (NaN) in the resulting data frame. Is there a way to replace NaN fields with zero so that the calculated fields are no longer invalid? This would allow me to use them later in the pipeline as a filter.

As an example, I've attached a notebook in which the calculated field NWC is invalidated whenever unearned revenue is missing.
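To show the effect outside the pipeline, here is a minimal sketch in plain pandas. The tickers, column names, values, and the NWC formula itself are made up for illustration and are not taken from the notebook; the only point is that arithmetic on a NaN input gives a NaN result.

```python
import numpy as np
import pandas as pd

# Hypothetical fundamentals for three stocks; BBB has no unearned revenue reported.
df = pd.DataFrame(
    {
        "current_assets": [100.0, 80.0, 60.0],
        "current_liabilities": [40.0, 30.0, 20.0],
        "unearned_revenue": [5.0, np.nan, 2.0],
    },
    index=["AAA", "BBB", "CCC"],
)

# Assumed adjustment (illustrative only): add unearned revenue back to working capital.
df["nwc"] = (
    df["current_assets"] - df["current_liabilities"] + df["unearned_revenue"]
)

# Arithmetic with NaN yields NaN, so BBB's NWC is missing even though
# its other two inputs are present.
nwc_missing = df["nwc"].isna()
print(df)
```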

Any suggestions would be much appreciated!
Rob

5 responses

Stephen Kearney

Jan 23, 2017

df.fillna(0)

See:
http://chrisalbon.com/python/pandas_missing_data.html
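A short sketch of how this behaves on an ordinary pandas DataFrame (the column names `a` and `b` are placeholders). It assumes nothing about the pipeline; it only shows that `fillna` returns a new frame and can be limited to chosen columns by passing a dict.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan], "b": [np.nan, 4.0]})

# fillna returns a new DataFrame; the original is left unchanged
# unless the result is assigned back.
filled = df.fillna(0)

# Passing a dict fills only the named columns; NaNs in "a" stay as they are.
only_b = df.fillna({"b": 0})

print(filled)
print(only_b)
```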

Rob F

Jan 23, 2017

Thanks for the tip, Stephen. However, I can only do this after I run the pipeline, which means I would have to calculate NWC again afterwards. What if I wanted to do additional calculations with NWC before running the pipeline? With .fillna(0), I would have to recalculate NWC and re-apply the filters after every run. It would be easier if I could do all of this before I run the pipeline.
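The difference between filling before and after the calculation can be sketched in plain pandas. The data and the NWC formula below are hypothetical, and this is not Pipeline API code; whether an equivalent fill is available inside the pipeline itself is not something this sketch answers.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs; BBB is missing unearned revenue.
idx = ["AAA", "BBB", "CCC"]
current_assets = pd.Series([100.0, 80.0, 60.0], index=idx)
current_liabilities = pd.Series([40.0, 30.0, 20.0], index=idx)
unearned_revenue = pd.Series([5.0, np.nan, 2.0], index=idx)

# Filling AFTER the calculation: the whole NWC value becomes 0 for BBB.
raw_nwc = current_assets - current_liabilities + unearned_revenue
filled_after = raw_nwc.fillna(0)

# Filling the INPUT first: only the missing term is treated as 0,
# so BBB keeps the part of NWC that could be computed.
filled_before = current_assets - current_liabilities + unearned_revenue.fillna(0)

print(pd.DataFrame({"filled_after": filled_after, "filled_before": filled_before}))
```

The two results agree wherever no input was missing and differ only for BBB, which is why filling the final column is not the same as recalculating NWC from filled inputs.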

Stephen Kearney

Jan 24, 2017

Hi Rob,

Try changing the next-to-last line to this:

result_df = run_pipeline(mypipeline(), '2016-05-05', '2016-05-05').fillna(0)

Rob F

Jan 24, 2017

Good idea, but unfortunately I'm running into the same issue. So far I've had the most success attaching .isnan() to the column. However, this returns a boolean, and I'm having trouble specifying that I want 0 where it is True.
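For reference, this is roughly how a boolean NaN mask is turned into zeros on an ordinary pandas Series. It is a sketch with made-up values and uses pandas and NumPy calls only, not pipeline factor methods, so it applies to data that has already been pulled out of the pipeline.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value.
unearned_revenue = pd.Series([5.0, np.nan, 2.0], index=["AAA", "BBB", "CCC"])

# Boolean mask: True where the value is NaN.
mask = unearned_revenue.isna()

# Series.where keeps values where the condition is True and substitutes
# the second argument elsewhere, so the mask is inverted here.
zeroed = unearned_revenue.where(~mask, 0.0)

# NumPy equivalent: pick 0.0 where the mask is True, the original value otherwise.
zeroed_np = np.where(mask, 0.0, unearned_revenue)

print(zeroed)
```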

I'd appreciate a little more guidance.

Thanks
Rob

Stephen Kearney

Jan 24, 2017

Here's my copy of your notebook with the fillna(0) change. It's returning zeros for the cells that were NaN in your first post.

The return object of run_pipeline() is a DataFrame. So the .fillna(0) I added changes the NaNs to zero in that returned DataFrame.
