Pipeline backtesting vs before_trading_start timeouts

Hi,

I know that in live trading, we are supposed to have a 5-minute timeout during before_trading_start. How does this interact with pipeline pre-computing a year's worth of pipeline results? If the pre-computation exceeds 5 minutes, does that cause an exception in our algo? (It seems to, which seems a little harsh, since it is unlikely it ever would in live trading.)

Simon.

8 responses

Hey Simon,
I'm not sure I understand what you are asking. What do you think should happen if your computation takes longer than 5 minutes, in backtesting or in live trading? Full disclosure: we have just started to look into live trading with the pipeline API, so I haven't worked through the edge cases yet, but I'd love your thoughts on what you want the behavior to be.

KR

Hi Karen,

The idea is that if the 1-year pre-computation takes place under the 5-minute time constraint, it is overly restrictive for backtesting. Under live trading, the computation would be done on a rolling basis, every day; there would be no pre-computation. So, you could end up filtering out algos that would run live with no excessive run-time of before_trading_start.
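To put rough numbers on this, here is a minimal arithmetic sketch. It assumes, purely for illustration, that a one-year bulk computation costs about as much as 252 daily computations run back to back; the real bulk path may well be cheaper per day. The names below are hypothetical and not part of any Quantopian API.

```python
# Illustrative arithmetic only. Assumption: bulk cost scales linearly
# with the number of trading days in the pre-computed chunk.
TRADING_DAYS_PER_YEAR = 252   # approximate count, an assumption
LIMIT_S = 5 * 60              # the 5-minute limit discussed in this thread


def chunk_seconds(per_day_s, days=TRADING_DAYS_PER_YEAR):
    """Estimated time to pre-compute `days` of results in one go."""
    return per_day_s * days


def max_per_day_seconds(limit_s=LIMIT_S, days=TRADING_DAYS_PER_YEAR):
    """Largest per-day cost for which a full chunk still fits the limit."""
    return limit_s / days


if __name__ == "__main__":
    per_day = 2.0  # seconds per day in live trading, far below the limit
    print("one-year chunk:", chunk_seconds(per_day), "s")
    print("fits the limit:", chunk_seconds(per_day) <= LIMIT_S)
    print("per-day budget:", round(max_per_day_seconds(), 2), "s")
```

Under that assumption, an algo needing 2 seconds per day would be fine live but would need about 504 seconds for the yearly chunk, so the backtest would time out.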

On https://www.quantopian.com/posts/introducing-the-pipeline-api, I had similar questions:

I don't understand "In backtesting, they are calculated in bulk, once per year." Is this done before the backtest even starts? Or will the backtest pause every year as it runs, to do the computation? And is the computation limited to 5 minutes? Or could it take longer? In backtesting, is there a way to turn off the pre-computation, to verify that the algo will run live?

As input to your thinking about live trading, in my opinion, it seems like bad style not to provide a means to catch a time-out error. Users should be able to write their own exception handling. Also, under live trading, it would be prudent to understand how the execution of before_trading_start will scale with the number of algos using it and the level of computation performed. For example, as a worst case, what if every live trading algo pulled a year's worth of data into memory and crunched on it for almost 5 minutes? And then wrote a bunch of data to context? Etc. You'll need to get a sense for the worst-case scaling, since you have no control over what code users will execute.
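As a sketch of what a catchable time-out could look like, here is a plain Python example. It is not Quantopian's mechanism: run_with_deadline and its fallback argument are hypothetical names, and the worker thread is not killed when the deadline passes, so this only illustrates the control flow a user would want.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_with_deadline(fn, deadline_s, fallback):
    """Run fn() and stop waiting for it after deadline_s seconds.

    Returns (value, timed_out). On a time-out the caller gets `fallback`
    (for example, yesterday's results) instead of a crashed algo.
    """
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn)
    try:
        return future.result(timeout=deadline_s), False
    except FutureTimeout:
        # The user's own handling goes here: log, reuse stale data, etc.
        return fallback, True
    finally:
        # Do not block on the still-running worker.
        executor.shutdown(wait=False)


def slow_computation():
    time.sleep(0.5)  # stands in for a long before_trading_start
    return "fresh"


if __name__ == "__main__":
    print(run_with_deadline(slow_computation, 0.05, "stale"))
    print(run_with_deadline(lambda: "fresh", 0.2, "stale"))
```

The point of the design is that the time-out surfaces as an ordinary exception at one place, where the algo can decide what to do next.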

Grant

Hi Karen,

Yes, I meant what Grant mentioned - if we are to have 5 minutes in before_trading_start, then I would expect that to be lengthened in backtests since it will be doing more work than expected (in live trading), once a year. In backtesting pipeline algos, I've already come across unpredictability in whether the algo will time out doing pipeline stuff or not. Sometimes it does, sometimes it doesn't. This is doing a trivial calculation, but with a window_length of 300. I am doing a screen for the top 2500 stocks by market cap so it's not over the whole universe...

Karen,

Any idea what goes on under the hood that would cause variability in execution time? I thought each backtest/live algo was its own little virtual server? Maybe the virtualization means that the algo is coupled to the overall load on the system (it is not real-time)? Or sometimes the algo gets deployed to hardware that has lower performance? Is it I/O that causes the variability? Also, what are you trying to accomplish with the time out in the first place? If before_trading_start runs just before market open at 9:30, it makes sense that it needs to wrap up its business. But if it is running in the middle of the night, and you've already dedicated the computing resources to the algo, what difference does it make how long it runs? Why not make it 30 minutes? 1 hour? 6 hours?

Variable execution time doesn't make sense to me, and I'll have to look into it.

Simon, I'll be in touch to get more information.

Rich pointed out to me that variable execution timeouts can be caused by fundamentals queries and the number of backtests (or users) getting data from the fundamentals servers.

The good news is one of the perf improvements going out shortly will hopefully help this. We'll keep an eye on it.

Karen,

My guess is that unless you are running a real-time OS, there will be variability. Even then, the I/O will cause variability, as Rich pointed out. The problem is that you'll have people with code that they time to run at 4 min. 37 secs. and they'll think that they are fat, dumb, and happy. Then, one day, their gazillion-dollar Q hedge fund algo will crash due to a time-out. You could add a guard band, which you manage, based on the worst-case variability. You could issue a warning if the guard band is exceeded, so that when users are testing their algos, they would know that they are on thin ice. And, as I mention above, it would still be best practice to empower the user to catch the time-out error, rather than crashing the algo (the same would apply to handle_data calls).
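The guard band idea can be sketched in a few lines of plain Python. Everything here is hypothetical: the 20% band, the function names, and the status strings are illustrative choices, not anything the platform provides.

```python
import time
import warnings

TIME_LIMIT_S = 300.0  # the 5-minute limit discussed in this thread
GUARD_BAND = 0.2      # hypothetical: warn within the last 20% of the limit


def check_guard_band(elapsed_s, limit_s=TIME_LIMIT_S, guard=GUARD_BAND):
    """Classify a measured run time as 'ok', 'warn', or 'timeout'."""
    if elapsed_s >= limit_s:
        return "timeout"
    if elapsed_s >= limit_s * (1.0 - guard):
        return "warn"
    return "ok"


def timed_call(fn, limit_s=TIME_LIMIT_S, guard=GUARD_BAND):
    """Run fn(), measure it, and warn if it lands inside the guard band."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    status = check_guard_band(elapsed, limit_s, guard)
    if status == "warn":
        warnings.warn(
            "run took %.1f s, within the guard band of the %.0f s limit"
            % (elapsed, limit_s)
        )
    return result, elapsed, status


if __name__ == "__main__":
    # 4 min 37 s is 277 s: under the limit, but inside a 20% guard band.
    print(check_guard_band(4 * 60 + 37))
```

With these numbers, the 4 min. 37 secs. run would pass today but be flagged during testing, which is the thin-ice warning I have in mind.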

Grant

"You could issue a warning if the guard band is exceeded, so that when users are testing their algos, they would know that they are on thin ice. And, as I mention above, it would still be best practice to empower the user to catch the time-out error, rather than crashing the algo (the same would apply to handle_data calls)."

I subscribe too!