Minutely Backtesting, Compute Cost Etc.

Hello Quantopian,

I now feel the need to start running minutely backtests, but this raises some issues around how long a minutely backtest may take to run, especially for backtests spanning years with multiple securities.

The first issue is fairly trivial. Could the Full Backtest results page be amended to show the elapsed time for a completed backtest? I'm looking at an algo that may take an hour or more to run, so I would just leave it running and return later.

The second is more problematic. What limits do Quantopian currently impose on the compute cost of a user and their algorithms? Obviously you are paying by the (milli)second for our algos to run on your cloud infrastructure. And looking forward, what limits might you impose for non-revenue-earning (from your perspective) users?

P.

4 responses

Currently, the limits are crude and loose. Your algo churns away on a single CPU, and if it ever takes longer than 50 seconds to run a single frame of handle_data(), it throws an error and stops. That's meant to loosely conform to live trading - if your algo takes longer than 50 seconds to process a one-minute bar, then your algo risks "falling behind" real time. One can imagine that you come up with an algo that requires more computation than that - maybe you need 2 CPUs, or 4, or you need a GPU cluster to get your work done in 50 seconds.
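The per-bar time budget described above can be illustrated with a small sketch. This is not Quantopian's actual enforcement mechanism (which is internal to their backtester); it's a minimal stand-in showing the behavior: abort any handle_data() call that runs past a fixed budget. The budget is shortened here so the example runs quickly, and the SIGALRM approach is POSIX-only.

```python
import signal
import time

# Illustrative only: a stand-in for the backtester's per-bar limit.
# The real limit described above is 50 seconds; we use 1 here so the
# example finishes quickly.
TIME_BUDGET_SECONDS = 1


class BarTimeoutError(Exception):
    """Raised when a single handle_data() call exceeds the time budget."""


def _on_timeout(signum, frame):
    raise BarTimeoutError("handle_data() exceeded the per-bar time budget")


def run_bar(handle_data, context, data):
    """Run one minute bar under a time limit (POSIX-only: uses SIGALRM)."""
    signal.signal(signal.SIGALRM, _on_timeout)
    signal.alarm(TIME_BUDGET_SECONDS)
    try:
        handle_data(context, data)
    finally:
        signal.alarm(0)  # cancel the pending alarm either way


def fast_handle_data(context, data):
    pass  # finishes instantly -- well within budget


def slow_handle_data(context, data):
    time.sleep(TIME_BUDGET_SECONDS + 1)  # deliberately blows the budget


run_bar(fast_handle_data, None, None)  # completes normally
try:
    run_bar(slow_handle_data, None, None)
except BarTimeoutError as e:
    print("aborted:", e)
```

The design choice mirrors the live-trading rationale in the post: rather than letting one slow bar stall the whole run, the call is interrupted and surfaced as an error.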

The short answer is that the single CPU backtest is probably going to remain free forever. If you have an algo that needs more, at some point we'll offer the beefier computation at some metered price.

In the long run, it's in our interest to keep processing free and/or as cheap as possible. We want you to test as many ideas as you can so that you can find one that makes you money. And, we don't want you to waste time optimizing a bad idea to fit into some CPU limit - we want you to find a good idea, and THEN optimize it. And, we want to pay as little as possible ourselves for CPU usage, so we have a lot of incentive to optimize the heck out of the backtester.


Thanks, Dan.

I wrote a nice long answer yesterday, and I forgot to post it. Take 2:

The elapsed time on a backtest is actually there, but it's hidden. In the upper-right corner, next to Share Results, there's an info button with a drop-down menu offering "View code" and "Backtest details." Backtest details has what you're looking for.

The current computing limits are crude and loose. Your backtest runs on one CPU. If your algo takes longer than 50 seconds on any call to handle_data(), it throws an error and stops. That loosely conforms to live trading: if handle_data() takes longer than 50 seconds, you fall behind real time, and that's not good. I think that level of processing is going to be free forever. We expect that there will be more CPU-intensive algos coming. To support those, we'll make a paid offering for multi-CPU or GPU machines. Presumably we'll charge some metered amount for that extra processing.

In the big picture, it's in our interest to make your processing as free/cheap as possible. We want you to test many many ideas and come up with ideas that will make you money. We're going to get more people testing if we keep it free. Also, we don't want you to optimize the CPU usage on some idea that turns out to be a loser - we want you to find a great idea, and THEN worry about optimizing for the CPU. And since we're trying to give away as much processing as we can, we're always going to try to optimize our code to make your testing more efficient.

Hello Dan,

Thanks - I've never looked there before. An issue, maybe, is the 'N/A':

This was run at 16:28 here in the UK.

P.