public access to Quantopian data?

Perhaps you consider it proprietary, but could you provide details on the source of your backtest data? Are you buying it? From whom? Do you consider it open-source/public domain?

Have you considered posting the data to a public archive for download (with quarterly updates, for example)? Or does your vendor (presumably you have one) apply restrictions to the distribution?

I suppose this gets back to the question of what business do you want to be in? I think supplying the data (either raw or reduced by you) free to the worldwide public would be a nice service, and you could still offer access to it with your in-browser backtesting software. The challenge then would be to develop a superior in-browser online user experience, relative to what could be done on a stand-alone workstation.

11 responses

As a side comment, direct access to the data could address the concern of some users with regard to security of their intellectual property. If your backtesting software will be open-source, then presumably a sophisticated user could figure out how to run it on his hardware, right? All he'd be missing would be the data and a means to place orders (which you could facilitate, as well). Again, you could still develop and offer the on-line, in-browser interface, live trading, etc.

Hi Grant,

We're buying our data from a 3rd party. The data itself is definitely not open domain. We've secured the right to use it, and for our members to use it in testing, but we don't have the right to redistribute it. The cost of per-minute bar data is very high - we could never afford the fees to redistribute it. The backtester is something we own, and that gives us the power to open source it. The data we do not.

In our long-term vision, we plan on making it possible (even easy) to add new data sources. Some of those data sources might be paid, some of them free, some open sourced, etc. Maybe someday there will be an open-sourced version of the tick data - that would be totally awesome and we'd love to support it. I suspect what is more likely is we'll see dozens of non-tick data sources, from social media sentiment to economic metrics to weather reports.

All that said, I don't know of anywhere else where you can find free access to by-minute data and free backtesting. I'm hopeful that is an attractive package that traders will want to take advantage of.

Dan

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Hey,

Wow, you have open-sourced the backtester. I would appreciate it if I could have a look at the code. Is there a public repo or something?

Thanks in advance for your answer!

Max,

Yes, you are more than welcome to the code: https://github.com/quantopian/zipline

thanks,
fawce

Hi Dan,

Thanks for the reply; we understand the issues with redistribution. While the data is not open domain, there should be more transparency as to where it is coming from. What are your sources for the tick-level data, and what post-processing (if any) does your company apply to it? How do you validate any post-processing that you perform? These are important questions that I would think any trader would need answered before fully trusting a new site with their money. For example, Portfolio123 (another site I use) is very transparent about their data sources and any post-processing they do. I'd love to see the same level of information from you guys. Great work thus far!

Cheers,
Ragu

Hello Ragu,

I sort of agree, but also appreciate that Quantopian may have proprietary information and a vendor relationship that they need to protect. That said, a bit more information regarding the accuracy of the data set in the context of backtesting would be appropriate.

Generally, I've been curious about why the cost of minute-level trade data is so high. I've heard various explanations from a couple of sources, but given that Quantopian is now able to provide daily updates, I suspect that the data reduction is fully automated. I kind of doubt that there is a room full of analysts working overnight to crunch the numbers. So, in the end, my sense is that market data should be an inexpensive commodity, unless there is a monopoly or price fixing.
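To illustrate why such a reduction step can plausibly be automated, here is a minimal sketch of collapsing raw trades into minute bars. This is only an assumption about what a pipeline of this kind might look like, not a description of what Quantopian or its vendor actually does; the function name `ticks_to_minute_bars` and the sample trades are made up for illustration.

```python
from datetime import datetime


def ticks_to_minute_bars(ticks):
    """Aggregate (timestamp, price, size) trades into per-minute OHLCV bars.

    `ticks` is assumed to be sorted by timestamp. Returns a dict keyed by
    the start of each minute.
    """
    bars = {}
    for ts, price, size in ticks:
        # Truncate the timestamp to the start of its minute.
        minute = ts.replace(second=0, microsecond=0)
        bar = bars.get(minute)
        if bar is None:
            # First trade of the minute sets every price field.
            bars[minute] = {
                "open": price,
                "high": price,
                "low": price,
                "close": price,
                "volume": size,
            }
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last trade seen so far in this minute
            bar["volume"] += size
    return bars


# Hypothetical trades, for illustration only.
sample_ticks = [
    (datetime(2013, 1, 2, 9, 30, 1), 100.0, 200),
    (datetime(2013, 1, 2, 9, 30, 40), 100.5, 100),
    (datetime(2013, 1, 2, 9, 30, 59), 99.8, 300),
    (datetime(2013, 1, 2, 9, 31, 5), 100.1, 150),
]
minute_bars = ticks_to_minute_bars(sample_ticks)
```

A real feed would also need to handle corrections, cancelled trades, and corporate actions, which this sketch ignores.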

Grant

Hey Dan, sorry for the necro on this thread but I'm new and curious if you've looked into accessing Canadian exchange data?

We remain very focused on live trading with US equities. Once we have that product in general availability, then we'll look at adding more data sources. I'm inclined to think we'll do things like fundamental data before we do non-US securities. I think it will be a while before we get to Canadian exchanges, unfortunately.

Make sure you look at Fetcher. It might get you what you're looking for using external data sources.

Hi,

I am trying to get the most out of Quantopian for devising a strategy for my academic research, and I want to know the fields of the Quantopian dataset. Does it use Trade & Quote data, like bid or offer prices?

Hi Sam,

For details, take a look at the FAQ under Data Sources. Briefly, we give you access to 12 years of minute-bar, trade-level data (not bid/ask or quote data) for US equities. If you have another time-series data source you'd like to combine with our market data, you can check out the Fetcher method, which allows users to pull in CSV data via HTTP if it adheres to a set format.
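As a rough illustration of combining an external CSV series with minute bars, here is a minimal standard-library sketch. It is not Fetcher's API: the `date` and `value` column names, the helper names, and the sample rows are all assumptions, and the CSV is an inline string rather than something pulled over HTTP.

```python
import csv
import io
from bisect import bisect_right
from datetime import datetime


def load_signal(csv_text):
    """Parse CSV text with `date` and `value` columns (assumed names)
    into a sorted list of (datetime, float) pairs."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return sorted(
        (datetime.strptime(row["date"], "%Y-%m-%d"), float(row["value"]))
        for row in rows
    )


def value_as_of(signal, ts):
    """Return the most recent signal value at or before `ts`,
    or None if the signal has not started yet."""
    times = [t for t, _ in signal]
    i = bisect_right(times, ts)
    return signal[i - 1][1] if i else None


# Hypothetical external data, for illustration only.
csv_text = "date,value\n2013-01-02,0.25\n2013-01-03,0.40\n"
signal = load_signal(csv_text)

# Look up the value that applies to a given minute bar.
bar_time = datetime(2013, 1, 2, 9, 31)
print(value_as_of(signal, bar_time))
```

One thing to check with your own data is whether a value dated on a given day was actually known at the open; if it was published after the close, carrying it into that day's bars would leak future information into the backtest.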

Hope this is helpful. One other alternative, if you have access to your own academic data source and you're just looking to use the backtester, would be to check out our open source project Zipline. I don't think it would be trivial to re-jigger it to simulate correctly off of bid/ask data, though others might have more insight there.

Best wishes, Jess

Is this the data you want? http://www.backtestdata.com/order