Pipeline and get_pricing price shift?

Hello Quantopians,
I am completely new to this field, both to trading and to Python programming :)

I am getting different data from Pipeline and from the get_pricing method. It seems that the Pipeline data is shifted one day later.

Could you please give me some advice on how to solve this?

Please see the attached notebook (ticker 'AA', for example).

Thanks a lot in advance

7 responses

The one-day offset is there because Pipeline values are computed before trading starts each day. So the close price you get in Pipeline is the last known price before trading starts, that is, yesterday's close. This behavior mimics the values you would get from Pipeline in an algorithm. That is not the case for get_pricing, which reports the price for that exact day instead.

This is not the only difference between Pipeline and get_pricing: the merges/splits are treated differently too.

Have a look here https://www.quantopian.com/posts/pipeline-frequency-in-research

To add a bit to what @Luca posted.

There really isn't a 'shift' in the data but rather a different interpretation of the dates when using the get_pricing and the run_pipeline methods. The dates specified in the 'get_pricing' method are the actual dates one wants data for (as one would expect). However, the dates specified in the 'run_pipeline' method are the dates on which to run the pipeline. By definition, pipelines fetch the previous day's data and only fetch daily data. This makes sense if one considers an algorithm running a pipeline on day N. One can't know the current day's close, high, low, or volume. All one can know is yesterday's (N-1) data. Specifying the pipeline 'run' date therefore always returns data from the previous trading day.
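To make the two date conventions concrete, here is a minimal plain-Python sketch. It does not call the Quantopian API; the prices are made-up numbers (not real 'AA' data), and `pricing_style` and `pipeline_style` are hypothetical helper names used only for illustration.

```python
from datetime import date

# Hypothetical trading days and closing prices (made-up numbers).
closes = {
    date(2018, 6, 26): 46.10,
    date(2018, 6, 27): 45.80,
    date(2018, 6, 28): 46.55,
    date(2018, 6, 29): 46.90,
}

def pricing_style(closes):
    """Label each close with the day it was observed (get_pricing-like)."""
    return dict(closes)

def pipeline_style(closes):
    """Label each close with the NEXT trading day, i.e. the pipeline run date."""
    days = sorted(closes)
    # Pair every trading day with the one that follows it.
    return {run_day: closes[prev_day] for prev_day, run_day in zip(days, days[1:])}

by_data_date = pricing_style(closes)
by_run_date = pipeline_style(closes)

# The row labeled 2018-06-29 in the pipeline-style view is the 2018-06-28 close.
print(by_run_date[date(2018, 6, 29)], by_data_date[date(2018, 6, 28)])
```

The values are the same in both views; only the label attached to each value differs. The first day has no pipeline-style row in this sketch because there is no earlier close to report.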

Also, note @Luca's comment about how the data returned by a pipeline is adjusted differently for splits and dividends compared to the 'get_pricing' method. That's important and one reason one may want to choose one method over another. Actually, one would typically always want to use the 'get_pricing' method for price and volume data unless there is a specific reason to use the un-adjusted prices returned by pipeline.

Luca, Dan, thank you for your response. I kind of know what you are trying to tell me...

But what I don't understand is this scenario:
It is the morning of 29-06-2018 and my algo runs "before_trading_start" and returns Pipeline results. So it only knows data up to 28-06-2018, which makes sense. Why is it returning a row labeled 29-06-2018 and shifting the data to this date?
What I want from Pipeline is data up to the 28-06-2018 close with the right index.

Maybe this only happens in the research environment, and in the IDE the result would be different, based on the dates provided to the algo.

Thank you

The reason pipeline is returning rows beginning with 29-06-2018 is again related to the definition of dates when using pipeline. Think of the dates as 'when the pipeline was run'. Don't think of pipeline dates as the price or volume on that date.

This makes complete sense because one can have many types of factors returned by a pipeline, not just simple price and volume data. For instance, one could have a custom factor that returns prices as of 10 days ago, or a factor that returns the volatility over the past 6 months. For this reason, it's more consistent to consider the dates as 'when the pipeline is run on this date these are the values of the factors it would return'.

For this reason, there are times when it makes sense to use data from 'run_pipeline' and other times when one should use 'get_pricing'. If one wants to analyze a strategy and needs to know what data a strategy would see on any given day, then use pipeline data. It won't be biased with any forward looking stock splits etc. However, if one wants to analyze security trends then use 'get_pricing'. Pipelines are strategy oriented / 'get_pricing' is security oriented. Consider the pipeline dates as dates a strategy is evaluated and the factor data as of that date. Consider 'get_pricing' as dates for a security and price and volume data as of that date.
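If the goal is only to line up a "latest close" pipeline column against get_pricing output, one option is to relabel each pipeline run date with the previous trading day. The sketch below is plain Python with made-up values; `relabel_to_data_date` is a hypothetical helper, not part of the Quantopian API, and the list of trading days is assumed to be supplied by the caller.

```python
from datetime import date

def relabel_to_data_date(pipeline_rows, trading_days):
    """Map each pipeline run date to the previous trading day.

    pipeline_rows: dict of run date -> value (e.g. latest close).
    trading_days:  iterable of trading dates covering the run dates.
    """
    days = sorted(trading_days)
    # previous[run_day] is the trading day immediately before run_day.
    previous = {run_day: prev_day for prev_day, run_day in zip(days, days[1:])}
    return {
        previous[run_day]: value
        for run_day, value in pipeline_rows.items()
        if run_day in previous  # skip run dates with no known prior session
    }

# Hypothetical calendar: Wed, Thu, Fri, then Monday after the weekend.
trading_days = [date(2018, 6, 27), date(2018, 6, 28),
                date(2018, 6, 29), date(2018, 7, 2)]
# Made-up pipeline output keyed by run date.
pipeline_rows = {date(2018, 6, 29): 46.55, date(2018, 7, 2): 46.90}

realigned = relabel_to_data_date(pipeline_rows, trading_days)
print(realigned)
```

Using a list of trading days rather than subtracting one calendar day keeps Monday's row mapped to the preceding Friday. This relabeling only makes sense for a column that is simply the most recent value; for factors with a lookback window (such as volatility over six months), the run date remains the more meaningful label.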

Hope that doesn't just confuse things?

Dan, thank you sooo much !!!
Now I understand it perfectly.....

Did some exploration and now it is clear !!

Once again thank you very much...
Jan.

@Jan Falta, Glad to help. Good luck!

Thanks for posting, this helped a lot!