Code Structuring Advice

Hi,
I have an algorithm which deals with Open, Low, Close and High data. The algorithm looks for some specific patterns. These patterns are not over a fixed number of bars as such, so there are a lot of loops in my algo to go over the price history multiple times trying to figure out the price relationships.
In calculating my final outputs, which are a trend value along with specific important price levels, the algo creates many other temporary variables (around 20 or so). I have coded algorithms extensively in Amibroker and some in Java. The structuring of the code is very different for Amibroker, as it works on each symbol's price history as an array of price values (OHLC), and I can create/update variables on every bar as I see fit.
My question is: what is the best way to structure this kind of code for the Quantopian platform? It does a lot of back-and-forth traversal over price history while referencing multiple variables at different positions in the array. There are around 1,000 lines of such code.
My understanding is that all the data crunching happens in the handle_data() function, but if I try to do this kind of stuff in handle_data(), I end up with timeouts. Are there any best practices around structuring the code dealing with this kind of logic? Any inputs appreciated.
Thanks.

9 responses

Try to do all the calculations vectorized with numpy or pandas primitives. These days, I try to only ever have one loop in my code, and that is the one that loops over my final desired positions to call order.
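
As one illustration of replacing a bar-by-bar loop with pandas primitives, here is a minimal sketch that counts, on every bar, how many closes in a row have been rising. The price values and the names close, up and run_length are made up for the example and are not taken from anyone's algorithm in this thread.

```python
import pandas as pd

# Made-up closing prices, one value per bar.
close = pd.Series([10.0, 10.5, 11.0, 10.8, 10.9, 11.2, 11.5, 11.1])

# True where the close is higher than the previous close
# (the first bar compares against NaN and is therefore False).
up = close.diff() > 0

# Every non-rising bar starts a new group, so the cumulative sum of
# "not up" gives each run of rising bars its own group id.
groups = (~up).cumsum()

# Within each group, the running sum of rising bars is the run length.
run_length = up.astype(int).groupby(groups).cumsum()
```

The run length varies from bar to bar, so this kind of grouping trick may help with patterns that do not span a fixed number of bars.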

@Ashish, from my perspective as a beginner programmer, I think you can do what you want in Quantopian. It is basically Python with the Pandas, Talib, and Numpy modules (and others), plus a few unique functions including history(), order_target_percent(), etc. You can update variables once per minute.

I just read that you are getting timeouts. There are many ways to fix this: use fewer securities, use shorter history, and most importantly, optimize your loops. For instance, you only really need to call history one time, and save it as an object. That object can then be referenced many times inside your algo.
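
A small sketch of the "fetch once, reuse many times" idea. fetch_window is a hypothetical stand-in for a costly history() call and returns made-up prices; the call counter exists only to show that the data is fetched a single time.

```python
import numpy as np

calls = {"count": 0}

def fetch_window(n_bars):
    """Hypothetical stand-in for a costly history() call."""
    calls["count"] += 1
    return np.linspace(100.0, 110.0, n_bars)  # made-up closing prices

def compute_signals(n_bars=50):
    closes = fetch_window(n_bars)   # fetched exactly once
    # Everything below reuses the same array instead of fetching again.
    return {
        "fast": closes[-10:].mean(),  # mean of the last 10 bars
        "slow": closes.mean(),        # mean of the whole window
        "high": closes.max(),         # highest close in the window
    }
```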

BTW, my code has lots of loops, because my understanding of Vectorized Primitives is that they are monkeys who can shoot arrows.

Good luck!

Thanks Simon & Tristan,
I am working on making the code vectorized. My main difficulty is with code like:

if(var_one[i] > var_two[i-1] and var_one[i] < var_four[i]){  
        var_one[i] = new_value;  
        var_six[i] = some_value;  
}
else{  
        var_one[i] = other_value;  
        var_six[i] = var_two[i-1];  
}
.....

It is this kind of logic where multiple variables need referencing and/or updating, based on some conditions, where I get stuck with vectorization.
I hope I can get cracking with this though as I really like the concept behind Quantopian.

Ashish,

Note that the time-out of before_trading_start() is 5 minutes:

https://www.quantopian.com/posts/test-of-before-trading-start-time-out

So, if you can do computations in before_trading_start() you could give yourself some breathing room.

In your example above, I gather that you are needing to loop over i and so the logic gets applied repeatedly? You might have a look at:

http://docs.scipy.org/doc/numpy-1.10.0/reference/routines.logic.html

Maybe there is some way you can do element-wise comparisons?
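
For what it's worth, here is a rough sketch of how the if/else snippet above might look with element-wise comparisons and numpy.where. It assumes that new_value, some_value and other_value are plain scalars, that the loop starts at i = 1, and that no bar reads a value written at an earlier bar. If later bars depend on earlier results, this does not apply directly. The function name vectorized_rule is made up.

```python
import numpy as np

def vectorized_rule(var_one, var_two, var_four,
                    new_value, some_value, other_value):
    var_one = np.asarray(var_one, dtype=float)
    var_two = np.asarray(var_two, dtype=float)
    var_four = np.asarray(var_four, dtype=float)

    prev_two = var_two[:-1]  # var_two[i-1] for i = 1 .. n-1

    # Element-wise version of:
    #   var_one[i] > var_two[i-1] and var_one[i] < var_four[i]
    cond = (var_one[1:] > prev_two) & (var_one[1:] < var_four[1:])

    out_one = var_one.copy()
    out_six = np.full(var_one.shape, np.nan)  # bar 0 has no previous bar

    # Pick the "if" value where cond is True, the "else" value otherwise.
    out_one[1:] = np.where(cond, new_value, other_value)
    out_six[1:] = np.where(cond, some_value, prev_two)
    return out_one, out_six
```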

Thanks Grant,
My understanding of before_trading_start() is that it is used with the pipeline to create a trading universe, which is handy wherever it can be applied. Can trading logic, especially logic based on historical bars, be processed there as well?
There are some useful functions in the numpy link. I'll try to implement some of them; the element-wise comparison in particular could be quite useful.

Well, I'm not sure what's up with before_trading_start(). When I try to call history() from within it, I get:

IndexError: index 0 is out of bounds for axis 0 with size 0
There was a runtime error on line 7.

def initialize(context):
    context.stocks = sid(8554)

def before_trading_start(context, data):
    # This call raises the IndexError shown above
    prices = history(5*390, '1m', 'price')

def handle_data(context, data):
    pass

You may need to apply a work-around if history() cannot be called directly from within before_trading_start() (Q support, there's nothing in the help docs describing this limitation):

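def initialize(context):
    context.stocks = sid(8554)
    context.prices = None  # no trailing window until handle_data() has run

def before_trading_start(context, data):
    # prices = history(5*390, '1m', 'price')  # fails here, so use the stored copy
    print(context.prices)  # None on the first day

def handle_data(context, data):
    # Refresh the stored window every minute; the last call of the day
    # leaves the window that before_trading_start() sees the next morning.
    context.prices = history(5*390, '1m', 'price')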

Effectively, you'd end up needing to skip the first day of trading of your backtest and live trading, since you'd have no trailing window of data in before_trading_start() until the second day. Once you get beyond the first day, though, you'd have an up-to-date window, since history() will get called at the close of the prior day and store the result in context.

Pre-processing the history bars could be quite useful in cutting time spent in handle_data(). I guess it means that I should declare all the variables as part of initialize(context) as

context.some_var = None

so that they are accessible from before_trading_start(), is that right?

@Grant, looks like you found a bug and we'll fix it.

@Ashish, you're on the right track. If you need to make intense computations prior to the trading day, you should do this in before_trading_start. The algo has 5 minutes to process the data. Then, during trading hours, handle_data has 1 minute to run to stay current with market time. To save state between the functions, assign your variables to "context".
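
A minimal sketch of that pattern, runnable outside the platform. The Context class is a hypothetical stand-in for the real context object, the data argument of handle_data is a plain list of made-up prices rather than a real history() result, and the "heavy" computation is a one-line placeholder.

```python
import numpy as np

class Context(object):
    """Hypothetical stand-in for the platform's context object."""
    pass

def initialize(context):
    context.prices = None  # trailing window, filled during the first day
    context.trend = None   # result of the pre-market computation

def before_trading_start(context, data=None):
    if context.prices is None:
        return  # first day: no trailing window yet, so skip
    # Placeholder for the expensive pre-market work.
    change = context.prices[-1] - context.prices[0]
    context.trend = float(np.sign(change))

def handle_data(context, data):
    # Store the latest window so the next pre-market run can use it.
    context.prices = np.asarray(data, dtype=float)
    return context.trend  # cheap lookup of the precomputed result
```

Calling initialize, then before_trading_start and handle_data in turn for each simulated day, shows the first day running without a trend value and the second day picking one up from the stored window.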


Yes. You could take up to 5 minutes every day to do your pre-processing before the market opens, storing the results in context. Those results would then be available to handle_data or a scheduled function, and to before_trading_start the next time you call it.

Note that your algo will crash if you exceed a time-out, so I recommend making sure you have a guard band (e.g. check that your code in before_trading_start is executing consistently within 4-4.5 minutes max).
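
One way to check a guard band like this might be to time the pre-processing and compare it against a self-imposed budget. The sketch below uses Python's standard time module; whether that module can be imported on the platform is an assumption, and the names timed, within_budget and GUARD_BAND_S are made up.

```python
import time

TIME_LIMIT_S = 5 * 60   # time-out quoted in this thread
GUARD_BAND_S = 4 * 60   # self-imposed budget, below the time-out

def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed seconds)."""
    start = time.time()
    result = func(*args, **kwargs)
    elapsed = time.time() - start
    return result, elapsed

def within_budget(elapsed, budget=GUARD_BAND_S):
    """True if the measured run time stays inside the guard band."""
    return elapsed <= budget
```

Logging the elapsed time on every run during backtests would show whether the pre-processing stays consistently inside the budget.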