Pipeline values don't match history()

Attached is a simple algo that outputs a few values of USEquityPricing.low from the Pipeline, and also outputs history(..., "low"). If you run this in daily mode from 3/1/2016 to 3/2/2016, it shows (excerpted):

2016-03-01 PRINT today: 2016-03-01 00:00:00+00:00  
2016-03-01 PRINT first asset id (should be 2 for AA): 2  
2016-03-01 PRINT lows for first asset: [ 8.51  8.75  8.87  8.86  8.94]  
...
2016-03-01 handle_data:31 INFO 2016-02-24 00:00:00+00:00    8.25  
2016-02-25 00:00:00+00:00    8.55  
2016-02-26 00:00:00+00:00    8.82  
2016-02-29 00:00:00+00:00    8.85  
2016-03-01 00:00:00+00:00    8.92  
Name: Equity(2 [AA]), dtype: float64  
...

So the Pipeline says that the 5 days of lows for AA are:
8.51 8.75 8.87 8.86 8.94
history() says that the 5 days of lows for AA are:
8.25 8.55 8.82 8.85 8.92

The history() data matches Yahoo Finance. The Pipeline data looks more like the close prices offset by 1 day and a penny or so.

Is this expected? This seems somewhat disconcerting.

Thanks.

10 responses

Pipeline data is adjusted for both split and dividends, take a look at this explanation: https://www.quantopian.com/posts/the-pipeline-api-dividends-and-splits-what-you-need-to-know

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Thanks for the reply, but I don't see how that can account for the difference because:

  1. AA has not had any stock splits around February/March.
  2. AA did go ex-dividend on Feb 3, payable Feb 25, but the amount was $0.03, which is much smaller than the differences between the two data sets: 0.26 0.20 0.05 0.01 0.02. In other words, the largest gaps ($0.20 and $0.26) are far bigger than $0.03.
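
A quick way to check that arithmetic, as a minimal sketch in plain Python. The prices are copied from the log excerpt in the first post and the $0.03 dividend is the figure quoted above; the variable names are only illustrative and do not come from the attached algo.

```python
# Values copied from the log excerpt in the first post.
pipeline_lows = [8.51, 8.75, 8.87, 8.86, 8.94]
history_lows = [8.25, 8.55, 8.82, 8.85, 8.92]
dividend = 0.03  # AA dividend amount quoted in this post

# Position-by-position gap, rounded to cents to hide floating-point noise.
gaps = [round(p - h, 2) for p, h in zip(pipeline_lows, history_lows)]

# Which gaps are too large to be explained by a $0.03 dividend adjustment?
exceeds_dividend = [g > dividend for g in gaps]

print(gaps)
print(exceeds_dividend)
```

The first three gaps exceed the dividend amount, so a dividend adjustment alone would not explain them.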

Am I misunderstanding?

Thanks.

I think the problem is actually that your days are offset by one in the output. They're labeled properly, but it's not necessarily obvious that they are offset. I cloned your algo and ran it. Full log output here, slightly reformatted:

2016-03-01 PRINT today: 2016-03-01 00:00:00+00:00  
2016-03-01 PRINT first asset id (should be 2 for AA): 2  
2016-03-01 PRINT lows for first asset: [ 8.51  8.75  8.87  8.86  8.94]  
2016-03-01 PRINT today: 2016-03-02 00:00:00+00:00  
2016-03-01 PRINT first asset id (should be 2 for AA): 2  
2016-03-01 PRINT lows for first asset: [ 8.75  8.87  8.86  8.94  9.09]

2016-03-01 handle_data:31 INFO  
2016-02-24 00:00:00+00:00    8.25  
2016-02-25 00:00:00+00:00    8.55  
2016-02-26 00:00:00+00:00    8.82  
2016-02-29 00:00:00+00:00    8.85  
2016-03-01 00:00:00+00:00    8.92  
Name: Equity(2 [AA]), dtype: float64

2016-03-02 handle_data:31 INFO  
2016-02-25 00:00:00+00:00    8.55  
2016-02-26 00:00:00+00:00    8.82  
2016-02-29 00:00:00+00:00    8.85  
2016-03-01 00:00:00+00:00    8.92  
2016-03-02 00:00:00+00:00    9.10  
Name: Equity(2 [AA]), dtype: float64  
End of logs.

If you look at the first batch, the first date is 2/24. The 2nd batch starts a day later, on 2/25. When you re-align the dates, the 4 dates that exist in both sets do have matching low prices.

What's going on is that you are running your algorithm in daily mode. The pipeline runs at the start of the day, before the open, while history() runs at the end of the day. That means that history() has one more day of data to work with.
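
To make the timing concrete, here is a rough sketch in plain Python of which dates each 5-day window would cover on 2016-03-01 under that explanation. It does not use the Quantopian API. The helper `trading_days_ending` is hypothetical, and it assumes that weekdays stand in for trading days (market holidays are ignored).

```python
from datetime import date, timedelta


def trading_days_ending(end, n):
    """Return the n weekdays ending on `end` (inclusive), oldest first.

    Assumption: weekdays stand in for trading days; holidays are ignored.
    """
    days = []
    d = end
    while len(days) < n:
        if d.weekday() < 5:  # Monday=0 .. Friday=4
            days.append(d)
        d -= timedelta(days=1)
    return days[::-1]


today = date(2016, 3, 1)

# history() is called at the end of the session, so its window includes today.
history_days = trading_days_ending(today, 5)

# The pipeline ran before the open, so its window ends on the prior trading day.
pipeline_days = trading_days_ending(today - timedelta(days=1), 5)

# Dates that appear in both windows.
shared = [d for d in pipeline_days if d in history_days]

print(pipeline_days[0], "to", pipeline_days[-1])
print(history_days[0], "to", history_days[-1])
print(len(shared), "shared dates")
```

Under these assumptions the pipeline window covers 2016-02-23 through 2016-02-29, the history() window covers 2016-02-24 through 2016-03-01, and four dates are common to both.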

As for being off from Yahoo by a penny or some other small amount - that is not uncommon. Check out the explanation we put in the FAQ about as-traded price bars v. EOD price bars.

I can't replicate the data you put in your example, so there may be a third factor at work that I don't see yet.

Thanks for the reply. (And thanks for the many videos on your YouTube channel!)

Ok, if history() effectively has one more day to work with, then the proper comparison to do is:

Pipeline:     8.51  8.75  8.87  8.86  8.94  (missing one day)  
history():          8.25  8.55  8.82  8.85  8.92  
difference:         0.50  0.32  0.04  0.09  
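
The same shifted comparison as a small sketch in plain Python. The values are copied from the logs above; the one-day shift is the assumption under discussion, and the variable names are only illustrative.

```python
pipeline_lows = [8.51, 8.75, 8.87, 8.86, 8.94]  # window assumed to end one day earlier
history_lows = [8.25, 8.55, 8.82, 8.85, 8.92]   # window ends on the current day

# Drop the pipeline's oldest value and history()'s newest value so that
# both lists cover the same four days, then pair them up.
pairs = list(zip(pipeline_lows[1:], history_lows[:-1]))

# Difference per shared day, rounded to cents to hide floating-point noise.
diffs = [round(p - h, 2) for p, h in pairs]

print(diffs)  # [0.5, 0.32, 0.04, 0.09]
```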

These sets comparing Pipeline to history() do not 'have matching low prices', or do I misunderstand what you meant?

As for the FAQ, being off by $0.50 when the stock is around $8 seems like quite a lot. Is a gap that large really expected from the as-traded vs. EOD methodology?

I'm not sure what you mean by "can't replicate the data..." since your output replicated exactly what I'm seeing.

Thanks.

Somehow, we're not looking at the same thing. If you look at my output that I pasted above, it's different from the output you're quoting. I'm attaching my most recent backtest so we can compare apples to apples.

$0.50 is far more than I'd expect for an EOD pricing issue. We're still looking at a day-off-by-one issue, I believe.

I'm using the same source code as you. The reason the output that I originally quoted is different from yours is because I only pasted an excerpt of my full output (so as to not obscure the real issue). My full output matches your full output. Alternatively, if I excerpt your output, it is the same as what I originally quoted.

Anyway, I'm comparing the first 3 lines of output (which comes from LowFactor.compute()) to the first 3 lines of handle_data() output. And my last post took into account the day off-by-one by shifting the Pipeline data back one day. Let me know if this isn't clear.

Can you post a screenshot of what you're seeing? Or email me at [email protected]? I suspect there is a copy-and-paste error going on; the alternative is you are seeing something totally different. That would be surprising, but bugs usually are. . . .

1) Click clone on my backtest above
2) Run a full backtest
3) When I do that, the first 6 log lines look like this:

2016-03-01 PRINT today: 2016-03-01 00:00:00+00:00  
2016-03-01 PRINT first asset id (should be 2 for AA): 2  
2016-03-01 PRINT lows for first asset: [ 8.51  8.75  8.87  8.86  8.94]  
2016-03-01 PRINT today: 2016-03-02 00:00:00+00:00  
2016-03-01 PRINT first asset id (should be 2 for AA): 2  
2016-03-01 PRINT lows for first asset: [ 8.75  8.87  8.86  8.94  9.09]  

Line 3 and Line 6 are showing the same data, offset by one.

 [ 8.51  8.75  8.87  8.86  8.94    x ]  
 [   x   8.75  8.87  8.86  8.94  9.09]  
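
That "same data, offset by one" check can be written down as a small sketch in plain Python. The two windows are copied from log lines 3 and 6; the variable names are only illustrative.

```python
first_window = [8.51, 8.75, 8.87, 8.86, 8.94]   # printed with today = 2016-03-01
second_window = [8.75, 8.87, 8.86, 8.94, 9.09]  # printed with today = 2016-03-02

# A rolling 5-day window should drop its oldest value and gain one new value
# each day, so the last four of the first must equal the first four of the second.
slides_by_one = first_window[1:] == second_window[:-1]

print(slides_by_one)  # True
```

This only shows that the two Pipeline windows are consistent with each other. It says nothing about whether they agree with history().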

Here is an annotated screenshot.

The issue is comparing line 3 (which is from the Pipeline) to lines 7-11 (which are from history()). Offset-by-one, they are:

[ 8.51  8.75  8.87  8.86  8.94     x ]
[    x  8.25  8.55  8.82  8.85  8.92 ]

Thanks.

OK. I finally get what you're trying to tell me. I apologize for being dense - I was jumping to the easy solution, and you've found something a lot deeper. I will dig into it and get back to you.

Preliminarily, high and close appear to be right, but low is not. We have a known issue with open.

Thanks for continuing to look into this.