Quantopian as R&D environment?

Thought I'd throw out the topic of getting Quantopian moving toward filling out R&D capabilities, so that users could do screening, data exploration, algorithm development (sorry, but this ain't a development platform), etc. This post is more directed toward the Quantopian principals, versus customer support. Short of taking up a monetary collection, is there anything users could contribute along these lines, to help map out a potential future?

I realize that you are likely super-focused on going live with IB, but if R&D is still on your radar screen, the timing might be right to start discussing it.

Personally, not having any trading experience or other offline tools, I don't see a path using Quantopian alone to develop algorithms that are optimized and that I understand in-depth.

Grant

12 responses

I just joined the Quantopian community a week ago (I saw the article on the live trading on Hacker News), and for what it's worth, the reason I joined is the R&D tooling. For me, it hits a sweet spot between being easy to get running and offering R&D features/visualization. I'm personally not a fan of Python, but it's not too horrible, and there are certainly R&D timesavers that could be implemented so I don't have to build/backtest again and again.

Hello Jason,

As a convenient backtester for trying out back-of-the-envelope ideas, it is great. However, the way I think about it is that there are institutional investors with better data than Quantopian can provide, lots of crackerjack analysts and programmers, and big ol' computers. So, Quantopian has a ways to go before providing an end-to-end product that would give retail traders a competitive edge.

Grant

If you are wanting to do retail trading, probably the biggest feature Quantopian could add is increased flexibility of external data feeds. Forcing the data to be in CSV format is quite limiting (it would require you to program/deploy/run a server to convert your external real-time feeds to CSV). Plus, I don't know if you can invoke the external feed requests during the handle_data() loop, but if you can't, that's a total show-stopper.

Otherwise, I think the most powerful R&D tool is actually the Quantopian community already in place, with their quantitative researcher (ThomasW) providing interesting bits.

That said, I think your OP is valid, so I don't want to derail you from your objective: getting some response from Quantopian. I'll stop sidetracking now.

I would say some way to parameterize algorithms, and a way to fit those parameters, would be a good start. Also better ways to present (and export) backtest results, plus more permissive logging that can be exported and used for offline analysis.

Hello Jason,

Data loaded via 'fetch' is loaded before the start of trading, i.e. once per day, or when the algo is restarted during a trading day. See: https://www.quantopian.com/posts/quandl-python-api
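
A minimal sketch of the pattern (the URL and the 'value' column are just placeholders):

def initialize(context):
    # fetch_csv runs up front (and on daily restarts in live trading),
    # not inside handle_data
    fetch_csv('http://example.com/my_signal.csv',
              date_column='date',
              symbol='my_signal')

def handle_data(context, data):
    if 'my_signal' in data:
        # a CSV column named 'value' shows up as an attribute
        record(signal=data['my_signal'].value)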

P.

Hi Jason, and SS TT,

Yes, both free-wheeling access to data streams and flexibility in how backtest algorithms are run would be great. If the scalability problem could be solved economically, then one could imagine being able to call a backtest function (or a set of backtests) from a script, and then analyze the results across backtest parameters (e.g. find a global optimum, plot a heat map, etc.). Parallel processing would speed things up (Thomas Wiecki provides an example on https://www.quantopian.com/posts/zipline-in-the-cloud-optimizing-financial-trading-algorithms using zipline).
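
To make the idea concrete, here is a rough offline sketch using zipline alone (the order-size "parameter", the run_backtest wrapper, and the Yahoo data pull are just illustrative stand-ins; it assumes a zipline version whose TradingAlgorithm.run() accepts a pandas DataFrame of prices):

from datetime import datetime
from multiprocessing import Pool

import pytz
from zipline.algorithm import TradingAlgorithm
from zipline.utils.factory import load_from_yahoo


class BuyStock(TradingAlgorithm):
    # Toy algorithm whose only "parameter" is the daily order size.
    def initialize(self, order_size=10):
        self.order_size = order_size

    def handle_data(self, data):
        self.order('AAPL', self.order_size)


def run_backtest(order_size):
    # One backtest per parameter value; returns the final portfolio value.
    start = datetime(2012, 1, 1, tzinfo=pytz.utc)
    end = datetime(2013, 1, 1, tzinfo=pytz.utc)
    data = load_from_yahoo(stocks=['AAPL'], indexes={}, start=start, end=end)
    perf = BuyStock(order_size=order_size).run(data)
    return order_size, perf.portfolio_value.iloc[-1]


if __name__ == '__main__':
    # Sweep the parameter across four worker processes.
    for size, final_value in Pool(4).map(run_backtest, [10, 50, 100, 200]):
        print(size, final_value)

Swap in a real parameter (lookback window, threshold, etc.) and an optimizer or heat map over the returned values, and that is the kind of workflow I have in mind.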

Grant

I guess I must be in the minority here, but I think Quantopian's value proposition is severely diminished if you can't do something like this in the handle_data() loop:

import urllib2  # hypothetical -- Quantopian doesn't allow this kind of call-out
url = "http://someserver.com/?sec={0}&price={1}&volume={2}&timestep={3}".format(sec.symbol, data[sec].price, data[sec].volume, data[sec].datetime)
response = urllib2.urlopen(url).read()
# do something with the returned data

Otherwise, real-time data feeds such as http://www.theflyonthewall.com/ are totally useless. Fine with me, because I'm not looking to be a professional trader, but for someone like Grant who is interested in doing this professionally, I can't see how that can be achieved without data feeds per timestep.

Hi Jason,

For the record, I'm not looking to be a professional trader. I've just been regularly involved on Quantopian and enjoy learning about the field and the intellectual challenge (in parallel, little by little, I'm picking up Python, too). More R&D capability would make it easier to get up the learning curve, I think.

Grant

Hi Jason,

The example you gave is an excellent illustration of exactly why we do not currently allow HTTP call-outs, or any other kind of calls out to third-party services, from within handle_data.

We are contractually obligated to keep the pricing data being fed into user algorithms from leaving our site. As your example illustrates, if we allowed call-outs to arbitrary URLs, those call-outs could be used to export the pricing data.

And there are other issues. Considering how fast our backtests run, if you make HTTP calls in every invocation of handle_data, it's likely that you'll overload whatever service you are calling out to. In addition to harming that service, you could also get us in trouble with Amazon. Depending on how much data is sent and received, it could make handle_data run too slowly, and too much bandwidth could be too expensive for us to support. And then of course there's the fact that allowing HTTP call-outs would essentially turn Quantopian into a free DDoS launching platform. As you might imagine, that's not a business we want to be in.

We hope to make it possible for algorithms to call out to certain external services in the future, though as I've just explained, how to do that is not at all straightforward and we don't yet have the answer. Furthermore, we intend to make additional data sources -- not just historical equity pricing data -- available to user algorithms. Some of the additional data sources will be free just like our pricing data, while others will be premium offerings. Our users have been telling us what data they want access to within Quantopian, and I'm sure they'll continue to tell us, and we hope to provide the data our users need to be successful.


Hello Jason (and Jonathan),

Quantopian's restriction against distributing pricing data is perhaps the biggest fundamental impediment to expanding R&D-oriented capabilities. If this nut could be cracked, it would provide a lot more flexibility. As Jonathan points out, bandwidth and computing power aren't free, but there should be ways to apply limits to keep costs in check (e.g. run batches overnight or over the weekend at low priority).

In my mind, I keep asking "Why aren't the pricing data free?" I'm not talking about Quantopian, but across the industry in general. If anyone has insights, I'm all ears.

Quantopian uses a real-time Nanex data feed (converted to minute bars on-the-fly, I think) for live trading. In keeping with their existing architecture, I think that additional feeds would need to be merged with the Nanex feed. I don't have experience, but I gather that fetcher provides this functionality in a scalable fashion (my sense is that Quantopian would like to get to ~10,000 users live trading), but the downside is that it is not real-time.

Grant

Along similar lines, I personally would be much more enthusiastic about Quantopian as an R&D environment if there were an offline mode. Using IPython and your favorite editor would allow for quicker prototyping.

The offline code part should be easy since zipline is open source. I understand the issue is market data access, but would it be possible to release, say, a month's worth of data for a few tickers, or possibly fake market data, so we could set up an offline version?
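
Something along these lines, for example (a rough sketch assuming a zipline version whose TradingAlgorithm.run() accepts a pandas DataFrame of prices; the 'FAKE' ticker and the random walk are obviously made up):

import numpy as np
import pandas as pd
import pytz
from zipline.algorithm import TradingAlgorithm


class BuyAndHold(TradingAlgorithm):
    def initialize(self):
        self.invested = False

    def handle_data(self, data):
        # Buy once, then hold for the rest of the (synthetic) month.
        if not self.invested:
            self.order('FAKE', 100)
            self.invested = True


# One month of made-up daily prices for a fictional ticker 'FAKE':
# a random walk around 100.
dates = pd.date_range('2013-06-03', '2013-06-28', freq='B', tz=pytz.utc)
prices = 100 + np.cumsum(np.random.randn(len(dates)))
data = pd.DataFrame({'FAKE': prices}, index=dates)

perf = BuyAndHold().run(data)
print(perf.portfolio_value.tail())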

Yes, I'm in Thailand and have an intermittent internet connection. About 1 out of every 3 backtests times out on me. That, plus the fact that backtests take a minute or so, is a bit annoying. I would be happy to pay a subscription to get an offline option (or greatly improved performance online).