There are several sources of price differences that are "normal." Then there are data errors, and then there are bugs. I'm sure we have some from each category in Quantopian.
Normal:
- Yahoo, among others, adjusts prices using a different methodology. In particular they treat dividends differently. It's not that they are wrong; their method is perfectly acceptable. It's not well suited to backtesting, though, and that's what we've been optimizing for.
- Quantopian's data source is constructed from the as-traded, intraday trade feed. Yahoo and others use what's called an end-of-day source. I wrote about it recently: "Yahoo is an 'end-of-day' (EOD) datasource. Yahoo and other EOD data providers get their price and volume data from the official exchange record. Quantopian's data is generated by the actual trades, regardless of what exchange the trade was made on. The EOD sources rarely exactly match data derived from intraday data. For instance, the official close for a NYSE stock is the last trade of the day for the stock on NYSE. But if the stock also trades on Chicago, Pacific or another regional exchange, the last trade on one of those exchanges could be our close."
- Especially with some older OHLCV data, there is no good record! It's amazing the data that these companies threw out back in the day. There's a guy who built a business on the fact that he saved every CD that his data provider sent him every month. They deleted it all, he kept it. And then he sold his collection back to them! So depending on the provider, some of the older data traces back to different original records.
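To make the second bullet concrete, here is a minimal sketch of how an official close and a last-trade close can diverge. The `Trade` class, the exchange labels, and the prices are all made up for illustration; this is not Quantopian's or Yahoo's actual pipeline, and it follows the simplified "last trade on the listing exchange" definition quoted above.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    time: str      # "HH:MM:SS", zero-padded so string comparison sorts correctly
    exchange: str  # hypothetical exchange label
    price: float


def official_close(trades, primary):
    """Last trade of the day on the primary listing exchange (EOD-style)."""
    on_primary = [t for t in trades if t.exchange == primary]
    return max(on_primary, key=lambda t: t.time).price


def last_trade_close(trades):
    """Last trade of the day on any exchange (trade-feed-style)."""
    return max(trades, key=lambda t: t.time).price


# Hypothetical end-of-session prints for one NYSE-listed stock.
trades = [
    Trade("15:59:57", "NYSE", 50.10),
    Trade("15:59:58", "NYSE", 50.11),
    Trade("15:59:59", "CHICAGO", 50.14),
]

eod = official_close(trades, primary="NYSE")
feed = last_trade_close(trades)
gap = round(feed - eod, 2)
print(eod, feed, gap)
```

With these invented trades the two closes differ by three cents, which is the "few pennies" kind of gap you'd expect from this cause.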
It's hard to know which scenario you're running into without studying the data. With the first one, you'll see older prices being very different, and the difference narrows as you get closer to the present. With the second one, the prices will be off by a few pennies each day, more so with thinly traded, wide-spread stocks. The third one is hard to identify because the source of truth is obscured.
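Here's a rough sketch of why the first scenario produces a gap that shrinks toward the present. It uses one common multiplicative back-adjustment for dividends; I'm not claiming this is exactly what Yahoo or Quantopian does. The function name, the prices, and the dividends are invented for the example.

```python
def back_adjust(closes, dividends):
    """Scale older closes down so returns across ex-dividend dates include the payout.

    closes:    as-traded closes, oldest first
    dividends: {index of ex-dividend day: cash dividend per share}, index >= 1
    """
    adjusted = list(closes)
    factor = 1.0
    # Walk backwards from the most recent day; the latest price is never changed.
    for i in range(len(closes) - 1, -1, -1):
        adjusted[i] = closes[i] * factor
        if i in dividends:
            # Every day before ex-date i picks up one more scaling factor.
            factor *= 1.0 - dividends[i] / closes[i - 1]
    return adjusted


# Hypothetical series with two $1.00 dividends (ex-dates at index 2 and 4).
closes = [100.0, 100.0, 99.0, 99.0, 98.0, 98.0]
adjusted = back_adjust(closes, {2: 1.0, 4: 1.0})

# Gap between as-traded and adjusted prices, oldest first.
gaps = [round(c - a, 2) for c, a in zip(closes, adjusted)]
print(gaps)
```

The gap is largest on the oldest days and drops to zero on the most recent one, because each dividend only rescales the prices that came before it. Two sources that adjust differently will disagree in that same shape.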
So, which flavor are you running into here? I don't know off the top of my head. Based on the 5 data points there I'd say it's probably door #2. The price differences look like .08, .23, 0, 0, and -.05. That's "not much" of a discrepancy, and doesn't have a pattern to it.
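For what it's worth, the eyeball test I'm doing on those five numbers looks something like this. The differences are the ones quoted above; I'm assuming they are in date order, and the variable names are just for illustration.

```python
# The five price differences from the thread, assumed to be in date order.
diffs = [0.08, 0.23, 0.0, 0.0, -0.05]

# How big is the worst disagreement?
largest = max(abs(d) for d in diffs)

# Which directions do the differences go? (-1, 0, +1)
signs = {(d > 0) - (d < 0) for d in diffs}

# An adjustment-methodology gap (door #1) should shrink steadily over time.
steadily_shrinking = all(abs(a) >= abs(b) for a, b in zip(diffs, diffs[1:]))

print(largest, sorted(signs), steadily_shrinking)
```

The differences go both directions and don't shrink steadily, which is why an adjustment-methodology gap looks unlikely to me and the close-price mismatch looks like the better guess.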
I'm always on the lookout for data problems. Obviously, the bigger they are the more time we can invest in tracking them down. At the moment, this one feels like it's in the noise.