HELP! run_pipeline() module name in IDE

I am new to Quantopian, and the differences between the modules available in the IDE and in research are slowing me down. The specific issue I am stuck on is the run_pipeline() function.
According to the tutorial, run_pipeline() lives in the quantopian.research module, but it seems to be available only in research, not in the IDE.

Can somebody point me to the correct module for doing this?

1 response

A pipeline is run a bit differently in the IDE (backtest) environment versus the notebook (research) environment. Instantiating a pipeline, adding columns, and adding screens are the same in both. The difference is in how the pipeline is actually run and how its output is retrieved.

The notebook (research) environment uses the 'Pipeline' class to create a pipeline and then the 'run_pipeline' function to run it. Therefore, the following imports are needed.

# imports for notebook research environment  
from quantopian.pipeline import Pipeline  
from quantopian.research import run_pipeline

The IDE (backtest) environment uses the same 'Pipeline' class to create a pipeline, but retrieves its results with the 'pipeline_output' function instead of calling 'run_pipeline'. An algorithm also needs one extra step to attach, or associate, the pipeline with the algorithm, so the 'attach_pipeline' function is required as well. Therefore, the following imports are needed.

# imports for IDE algorithm backtest environment  
from quantopian.pipeline import Pipeline  
from quantopian.algorithm import attach_pipeline, pipeline_output

Take a look at the documentation for info on each of these.
https://www.quantopian.com/help#quantopian_pipeline_Pipeline
https://www.quantopian.com/help#quantopian_research_run_pipeline
https://www.quantopian.com/help#initializing-a-pipeline
https://www.quantopian.com/help#using-results

There are a couple of reasons for the difference. First, in the IDE (backtest) environment, exactly one day's worth of data is returned (as a pandas DataFrame indexed by security). In the notebook (research) environment, one can ask for multiple days' worth of data (returned as a pandas DataFrame with a multi-index of security AND date). Second, in the IDE (backtest) environment, a pipeline runs pseudo-asynchronously (i.e., it retrieves and calculates multiple days of data in a single call). The 'attach_pipeline' and 'pipeline_output' functions work together to store those precalculated days of data and return them one day at a time to the backtest as needed.
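
To make the two result shapes concrete, here is a rough sketch that uses plain pandas only, so it runs outside Quantopian. It is not Quantopian's implementation: the tickers, dates, and prices are made up, and the 'ChunkedPipelineCache' class is a hypothetical stand-in for the idea of computing several days at once and handing back one day per call.

```python
import pandas as pd

# Made-up values standing in for a single pipeline column (not real market data).
dates = pd.to_datetime(["2020-01-02", "2020-01-03"])
securities = ["AAA", "BBB"]  # hypothetical tickers

# Research-style result: one row per (date, security) pair.
index = pd.MultiIndex.from_product([dates, securities], names=["date", "security"])
research_result = pd.DataFrame({"close": [10.0, 20.0, 11.0, 21.0]}, index=index)


class ChunkedPipelineCache:
    """Toy model: hold several precomputed days, serve one day per call."""

    def __init__(self, precomputed):
        self._precomputed = precomputed

    def output_for(self, day):
        # Selecting on the date level drops that level, which leaves
        # a frame indexed by security only.
        return self._precomputed.xs(pd.Timestamp(day), level="date")


cache = ChunkedPipelineCache(research_result)
backtest_result = cache.output_for("2020-01-03")

print(research_result.index.nlevels)  # 2 index levels: date and security
print(backtest_result.index.nlevels)  # 1 index level: security
print(backtest_result)
```

The same '.xs(..., level="date")' call is one way to pull a single day out of a multi-day research result when you want it to look like what an algorithm would see on that day.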