Comparing Data Sets

I'm trying to implement an algorithm that I optimized in MATLAB using minute pricing data I purchased from QuantQuote, but I can't match the results, not even close. Both QuantQuote's and Quantopian's data are adjusted for splits and dividends, yet there is a major difference in price. More importantly, when I pull prices for the same two dates at the same time of day, my MATLAB simulation always shows a different profit than the Quantopian data does.

For example, for XOM (minute-based close data):

QuantQuote:
20110210 930 - 68.58
20110211 930 - 69.184
profit/share - .604

Just for reference, given that the first bar in Quantopian's data starts at 931:

20110210 931 - 68.7478
20110211 931 - 69.3182
profit/share - .5704

Quantopian:

2011-02-10 14:31:00+00:00 - 81.75
2011-02-11 14:31:00+00:00 - 82.47
profit/share - .72
(There appears to be a time zone error; I'm just using the first line of each day.)
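
A quick way to check the timestamps, as a sketch only: the `+00:00` suffix marks the Quantopian times as UTC, and US Eastern time was UTC-5 in February 2011 (daylight saving was not in effect), so 14:31 UTC works out to 09:31 Eastern. The helper name `utc_to_est` and the fixed-offset `EST` constant are illustrative and not part of either vendor's tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset is enough here, because the sample dates
# fall in February 2011, outside US daylight saving time.
EST = timezone(timedelta(hours=-5), name="EST")


def utc_to_est(stamp: str) -> datetime:
    """Parse a timestamp that carries a UTC offset and convert it to EST."""
    return datetime.fromisoformat(stamp).astimezone(EST)


local = utc_to_est("2011-02-10 14:31:00+00:00")
print(local.strftime("%Y-%m-%d %H:%M %Z"))  # 2011-02-10 09:31 EST
```

For dates that span a daylight saving change, a real time zone database (for example the standard library's `zoneinfo`) would be the safer choice than a fixed offset.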

When I run my algorithm on each data set, the QuantQuote data yields on average double the profit from a simple 24-hour buy/sell. Because the profits are small, the errors build up quickly.
(No prices are adjusted for slippage in either algorithm)

Why isn't the raw data matching?

3 responses

Check out our FAQ about the data between Quantopian/Google/Yahoo/other vendors: https://www.quantopian.com/faq#data

In particular,

Why is your close price different from other data sources?

Quantopian uses the last traded price as the close price for the security. Depending on the data source, others may use end-of-day (EOD) prices. For example, Yahoo is an EOD data source. Yahoo and other EOD data providers get their price and volume data from the official exchange record. Quantopian's data is generated by the actual trades, regardless of what exchange the trade was made on. The EOD sources rarely match data derived from intraday data exactly. For instance, the official close for a NYSE stock is the last trade of the day for the stock on NYSE. But if the stock also trades on Chicago, Pacific or another regional exchange, the last trade on one of those exchanges could be our close.
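
To make the distinction concrete, here is a minimal sketch with made-up trades. The `Trade` class, the venue labels, the times and the prices are all hypothetical; the only point is that "last trade on the primary exchange" and "last trade on any exchange" can pick different prints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    time: str    # "HH:MM:SS", zero-padded so string order equals time order
    venue: str   # hypothetical venue label
    price: float


# Hypothetical end-of-day prints for one stock.
trades = [
    Trade("15:59:55", "REGIONAL", 82.46),
    Trade("15:59:57", "NYSE", 82.47),
    Trade("15:59:59", "REGIONAL", 82.49),
]


def official_close(trades, primary):
    """Last trade of the day on the primary listing exchange only."""
    on_primary = [t for t in trades if t.venue == primary]
    return max(on_primary, key=lambda t: t.time).price


def last_traded_price(trades):
    """Last trade of the day across every venue."""
    return max(trades, key=lambda t: t.time).price


print(official_close(trades, "NYSE"))  # 82.47
print(last_traded_price(trades))       # 82.49
```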

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Alisa,
Thanks for the reply. Did you see that I'm comparing against minute pricing data I purchased from QuantQuote?

James, the pricing data that you get on Quantopian is always split-adjusted back from a particular reference date. In backtesting and pipeline, prices are always adjusted as of the simulation date. If you get a price using get_pricing in research, it will be adjusted as of the end_date of your lookup.
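
As a rough illustration of adjusting "as of" a reference date, the sketch below applies only the splits that a given date already knows about. The split history, the `adjusted_price` helper and the prices are hypothetical, this is not Quantopian's implementation, and dividends are left out for brevity.

```python
from datetime import date

# Hypothetical split history: (ex_date, ratio), where 2.0 means a 2-for-1 split.
SPLITS = [(date(2012, 6, 1), 2.0)]


def adjusted_price(raw_price, price_date, as_of, splits=SPLITS):
    """Adjust a raw price for splits that occur after price_date
    and on or before the as_of (reference) date."""
    factor = 1.0
    for ex_date, ratio in splits:
        if price_date < ex_date <= as_of:
            factor *= ratio
    return raw_price / factor


raw = 100.0
traded_on = date(2011, 2, 10)

# Reference date before the split: the split is unknown, so no adjustment.
print(adjusted_price(raw, traded_on, as_of=date(2011, 12, 31)))  # 100.0
# Reference date after the split: the same bar is now reported at half price.
print(adjusted_price(raw, traded_on, as_of=date(2013, 1, 1)))    # 50.0
```

The same raw bar therefore shows up at two different price levels depending on the reference date used for the lookup.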

In practice, this means that you can get a different price for XOM depending on the reference/knowledge date when you ask for it. Typically, we've seen that other data sources adjust their data as of the current date. In backtesting, that introduces lookahead bias, so we always make the adjustment as of the current simulation date.
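
One way to compare series that sit at different adjusted price levels is to look at percentage returns instead of dollars per share. The sketch below uses only the six XOM prices quoted in the original post; the dictionary labels and the `pct_return` helper are just for illustration. With these numbers, the QuantQuote 09:30 bars and the Quantopian 14:31 UTC bars both come out at about 0.88%, which would be consistent with the two series differing by a roughly constant adjustment factor. That is an observation on two data points, not a confirmation.

```python
# (buy price, sell price) pairs taken from the original post.
quotes = {
    "QuantQuote 09:30": (68.58, 69.184),
    "QuantQuote 09:31": (68.7478, 69.3182),
    "Quantopian 14:31 UTC": (81.75, 82.47),
}


def pct_return(buy, sell):
    """Return of buying at `buy` and selling at `sell`, in percent."""
    return (sell - buy) / buy * 100.0


returns = {label: pct_return(buy, sell) for label, (buy, sell) in quotes.items()}

for label, value in returns.items():
    print(f"{label}: {value:.3f}%")
```

Dollar profit per share scales with the price level, so a series adjusted to a higher level shows a larger per-share profit even when the percentage return is the same.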

I've attached a notebook to help demonstrate what I mean with an example.

Let me know if this helps.
