Problem with pipeline

Hi,

I have an issue with the output of my pipeline. I'm trying to create a factor and put it in the output of my pipeline but get an error that I can't seem to figure out:

zipline.pipeline.pipeline.validate_column() expected a value of type zipline.pipeline.term.Term for argument 'term', but got int instead.

Thanks so much in advance.

Cheers,
Matt

2 responses

Good question, Mathieu!

One can't really use 'if-then' statements within a pipeline definition. Remember that the make_pipeline code is run only once, to define the pipeline, before the pipeline has fetched any data, so conditional logic like this doesn't work in this context. Another way to think of it is that factors such as 'eps' are pipeline terms that describe a computation and are NOT the actual data, and therefore cannot be compared using 'if-then' logic. That is also why the error complains about an int: a score built with 'if-then' logic ends up as a plain Python number, while every pipeline column must be a Term.
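To illustrate the "defined once, computed later" idea, here is a toy sketch in plain Python. The `Term` class below is a made-up stand-in written for this example, not zipline's actual implementation; it only shows that comparing a term produces another lazy term rather than a True/False value you could branch on.

```python
class Term:
    """Toy stand-in for a pipeline term: it describes a computation and holds no data."""

    def __init__(self, compute):
        # compute: function taking a dict of input data and returning a list of values
        self._compute = compute

    def __gt__(self, threshold):
        # Comparing a term builds another lazy term (a per-asset filter), not a bool.
        return Term(lambda data: [v > threshold for v in self._compute(data)])

    def run(self, data):
        # Values only exist once data is supplied, long after the definition was written.
        return self._compute(data)


eps = Term(lambda data: data["eps"])   # definition time: no data yet
positive_eps = eps > 0                 # still a Term, one result per asset later on

# Run time: data arrives and the comparison is evaluated for every asset.
result = positive_eps.run({"eps": [1.2, -0.4, 3.0]})
print(result)
```

Since `eps > 0` is evaluated per asset only at run time, there is no single answer available when the pipeline is being defined, which is why branching on it there cannot work.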

So, how to construct a score based upon a number of factors with simple logic? A typical scoring mechanism is to take z-scores of each factor (to normalize the values) and then sum these up. This can be done inside pipeline with the methods and operators defined for pipeline factors. However, a scoring system which simply adds and subtracts from a score based upon logical conditions is more problematic. Perhaps the cleanest approach is to output all the factors as pipeline columns and then manipulate the resulting DataFrame, generating the score with ordinary logic once the pipeline has been run.
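As a rough sketch of the DataFrame approach, the snippet below builds both kinds of score with pandas. The data, the column names (`eps`, `pe_ratio`), the tickers and the threshold of 30 are all made up for illustration; in practice `results` would be whatever DataFrame your pipeline run returns.

```python
import pandas as pd

# Hypothetical pipeline output: one row per asset, one column per factor.
results = pd.DataFrame(
    {"eps": [1.2, -0.4, 3.0, 0.5], "pe_ratio": [15.0, 40.0, 8.0, 22.0]},
    index=["AAA", "BBB", "CCC", "DDD"],
)

# Approach 1: normalize each factor to a z-score, then sum across factors.
factors = results[["eps", "pe_ratio"]]
zscores = (factors - factors.mean()) / factors.std()
results["zscore_sum"] = zscores.sum(axis=1)

# Approach 2: add or subtract points based on logical conditions.
# Boolean columns become 0/1, so each rule contributes one point.
results["rule_score"] = (
    (results["eps"] > 0).astype(int)           # +1 for positive earnings
    - (results["pe_ratio"] > 30).astype(int)   # -1 above a made-up valuation threshold
)

print(results)
```

The conditions here are vectorized comparisons on real data, so they behave the way the 'if-then' logic was meant to, just applied to every row at once.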

See the attached notebook for a possible solution.

If anyone has other approaches to generate composite scores it would be great hearing from you. This does come up from time to time.

Hope this helps.

Hi Dan,

Thanks so much for the detailed answer and for taking the time to reply.

I was so focused on my pipeline that I didn't even think about adding a score column to the DataFrame. Making all the calculations once the pipeline has been run makes it so much easier.

I'm gonna start working on that right away.

Thank you very much again!!

Cheers,
Mathieu