Beginner questions pipeline.

posted Aug 13, 2018

Hi everyone, I'm a newbie on Quantopian. My goal here is to write an algorithm that produces a list of interesting companies. I recently made my first algorithm using "morningstar.operation_ratios.roa.latest" to get the ROA of all the companies.

Now I want to use these values, for example to get the value and the company with the highest or lowest ROA. But this "morningstar.operation_ratios.roa.latest" thing has a weird type.

Is there a way of making a list or array with these values, or something like that?

5 responses

Dan Whitnable

Aug 13, 2018

First off welcome!

The "morningstar.operation_ratios.roa" thing is what's called a "bound column" object. It handles the connection to the correct field in the database. It's not the data itself, just the connection info. The "latest" attribute refers to the most recent value in the database. Again, it's not the actual data. Maybe read through the docs (https://www.quantopian.com/help#base-classes) and maybe this post for some insight (https://www.quantopian.com/posts/need-help-with-pipeline-just-a-beginner).

By adding factors to the pipeline, one is simply defining the columns one wants returned when the pipeline is run. To get a list (or, more precisely, a dataframe) of the data, simply run the pipeline using the "run_pipeline" function. The result is the data. The notebook above is correct. The only reason you probably didn't see any output is that the markets weren't open on the date you chose (2017-01-01). Try 2017-01-03 and you will see some data. The index contains all the securities.

Good luck.

Diego Sancho

Aug 14, 2018

Hi, thank you for your answer ! :)

Actually the notebook above "works", but it gives me the ROA and ROE of all the companies in the database (which is not very interesting). I've used only one day to get fewer rows. My goal is to give each company a grade, for example from 1 to 5, based on different criteria. I started with ROA and ROE, and then I'll be adding other things.

That's why I wanted a list or something similar, so I can use the actual data. If I got it right, you're saying that once I run the pipeline I can start using the actual data?

I've attached the notebook once again so you can see what it returns, even though I imagine you already have an idea of what it returns. (I didn't know I had to run the notebook before attaching it so you can see the results.)

Thank you !

Dan Whitnable

Aug 14, 2018

A couple of things. The results returned from a pipeline (i.e. the dataframe) are the same whether run in a notebook or in an algorithm, except that an algorithm returns a single day's worth of data (i.e. the current simulation day) and a notebook can return many days (hence the multi-index with the date). Several times above it's mentioned 'I wanted like a list or something so I can use the actual data'. The returned data is in a pandas dataframe, which is MUCH more powerful than a simple Python list (once you start using pandas you'll never go back to vanilla Python again). Do take time to explore and use pandas.
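To make the multi-index point concrete, here is a minimal pandas-only sketch. It does not call Quantopian at all: the dataframe is built by hand to imitate the shape of notebook pipeline output, and the tickers, index level names ("date", "security"), and numbers are all made up for illustration.

```python
import pandas as pd

# Hand-built stand-in for notebook pipeline output (hypothetical values).
# Assumption: rows are indexed by (date, security), columns are the factors.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2017-01-03", "2017-01-04"]), ["AAA", "BBB", "CCC"]],
    names=["date", "security"],
)
results = pd.DataFrame(
    {
        "roa": [0.05, 0.12, -0.02, 0.06, 0.11, -0.01],
        "roe": [0.10, 0.25, -0.05, 0.11, 0.24, -0.04],
    },
    index=index,
)

# Take one day's cross-section; the date level is dropped, leaving
# one row per security (similar to what an algorithm sees each day).
one_day = results.xs(pd.Timestamp("2017-01-03"), level="date")

# If a plain Python list really is needed, it is one call away.
securities = one_day.index.tolist()
roa_values = one_day["roa"].tolist()
```

Keeping the data in the dataframe is usually more convenient than converting to lists, since the securities stay attached to their values.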

While the basic function of a pipeline is to fetch data, as you noted that's not that interesting. One needs to do something with that data to (hopefully) predict successful trades. There are a number of filters and methods one can apply to the "factors" (i.e. the columns defined in a pipeline) within the pipeline definition (see the Quantopian docs: https://www.quantopian.com/help#pipeline-title). One can get the largest, smallest, perform basic math, rank, etc., all within the pipeline, so pre-calculated data is returned in the dataframe columns. One can also define filters within the pipeline so only specific securities are returned in the rows of the dataframe.

However, another approach is to think of the pipeline as simply a data fetching tool. With this approach, one could do the data manipulation with pandas (or other methods) outside of the pipeline definition once the pipeline is run. This is certainly the most flexible approach and gives access to a lot more methods (though there are a few pipeline factor methods, such as z-score, which are easier to do in a pipeline definition). Typically a hybrid is what one ends up with.

If you haven't already, I strongly encourage you to look at the Quantopian tutorials (https://www.quantopian.com/tutorials/algorithmic-trading-sentdex). If you are looking for ideas on how to 'grade' companies based upon a combined score, you may want to look at tutorial 11 (https://www.quantopian.com/tutorials/algorithmic-trading-sentdex#lesson11).
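As a rough illustration of the "manipulate it with pandas afterwards" approach, the sketch below finds the highest and lowest ROA in a single day's result. The dataframe is a hand-made stand-in for pipeline output; the tickers and values are hypothetical.

```python
import pandas as pd

# Hypothetical single-day pipeline result: one row per security.
one_day = pd.DataFrame(
    {"roa": [0.05, 0.12, -0.02, 0.08, 0.01]},
    index=["AAA", "BBB", "CCC", "DDD", "EEE"],
)

# Security with the highest and lowest ROA, plus the values themselves.
best = one_day["roa"].idxmax()
worst = one_day["roa"].idxmin()
best_value = one_day.loc[best, "roa"]
worst_value = one_day.loc[worst, "roa"]

# The two highest-ROA rows, and only the rows with positive ROA.
top_two = one_day.nlargest(2, "roa")
positive = one_day[one_day["roa"] > 0]
```

The same selections could instead be expressed inside the pipeline definition; doing it in pandas just keeps the pipeline itself a plain data fetch.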
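For the 1-to-5 grading idea, one possible sketch (not taken from the tutorial, just one simple way to do it) is to rank each criterion, average the ranks, and cut the combined score into five buckets. The data, tickers, and the equal weighting of ROA and ROE are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical single-day pipeline result with two criteria.
one_day = pd.DataFrame(
    {
        "roa": [0.05, 0.12, -0.02, 0.08, 0.01],
        "roe": [0.10, 0.25, -0.05, 0.09, 0.03],
    },
    index=["AAA", "BBB", "CCC", "DDD", "EEE"],
)

# Rank every column (1 = lowest value), then average the ranks so each
# criterion carries equal weight in the combined score.
ranks = one_day.rank()
combined = ranks.mean(axis=1)

# Re-rank the combined score (ties broken by row order) and split it into
# five equal-sized buckets labelled 1 (worst) to 5 (best).
grades = pd.qcut(
    combined.rank(method="first"), 5, labels=[1, 2, 3, 4, 5]
).astype(int)

one_day["grade"] = grades
```

More criteria can be added as extra columns before the ranking step; criteria where a lower value is better would need `rank(ascending=False)` on that column.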

Dan Whitnable

Aug 14, 2018

Attached is the same notebook with some added examples of how the returned dataframe can be manipulated to find specific stocks and return a narrowed down list.

Diego Sancho

Aug 14, 2018

Awesome !!! Thanks a lot !

This was really useful, I'm gonna take a look at all of this.

Thanks again :)
