suggestion: adding learn_period

Hi,

Many algos need a period of learning at the beginning before they start trading. It would be great if Quantopian could add a learn_period parameter that runs the algo for that many trading days before the start date. During the learn period, all orders are ignored, but the internal state of the algo can be modified.

This way, a live algo that requires learning can resume trading immediately when a small change is made.

Best,


Hello Anh,

I'm not sure I follow. When you say "all orders are ignored" do you mean that they would be simulated, and then you would start with live trading? Are you basically saying you'd like to carry results from backtesting over into live trading? Or something else?

Grant

I think I know what you're saying, Anh. You want algos to be able to just collect data from the recent past without having to occupy your live trading slot. Is that right? Basically doing the same thing as the history function, but without forcing you to use only recent data that actually CAN be collected with history(). I've wanted something like that for a while.

I'll try to translate Anh's idea with an example.
Say you want to write a pairs trading strategy. You need to evaluate "beta" (stockA = alpha + beta * stockB) during a learning period then use this parameter to get the size of your position for stockA and stockB. On top of that you may want to evaluate the half-life period and so on.
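The learning-period estimate Alex describes could be sketched as a plain least-squares fit over the warm-up window. This is a minimal pure-Python illustration, not Quantopian API code; the function name and inputs are hypothetical:

```python
def estimate_pair_params(prices_a, prices_b):
    """Estimate alpha and beta for the model stockA = alpha + beta * stockB
    by ordinary least squares over a learning window of prices."""
    n = len(prices_a)
    mean_a = sum(prices_a) / n
    mean_b = sum(prices_b) / n
    # covariance of A with B, and variance of B (unnormalized; the
    # normalization cancels in the ratio)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(prices_a, prices_b))
    var = sum((b - mean_b) ** 2 for b in prices_b)
    beta = cov / var
    alpha = mean_a - beta * mean_b
    return alpha, beta
```

Once alpha and beta are estimated over the learn period, the spread stockA - beta * stockB can be monitored for mean reversion during live trading.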

Sorry for the confusion, guys. Alex gave a good example. Let's say I want to calculate beta over the last 60 days.

I know this can be done by calling history for 60 days every time handle_data is called, but that is very inefficient, not to mention if I want 250 days. What I would do in my algo is process one bar of data per handle_data call and update beta accordingly. If handle_data could be invoked for a learn_period (here, 60 days) before the start date of trading / simulation, then the algo could calculate beta and be ready to trade on the first trading / simulation day.
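The one-bar-at-a-time update described here can be done with O(1) bookkeeping: maintain running sums over a fixed window and adjust them as bars enter and leave. A sketch (the class name and interface are illustrative, not a Quantopian API):

```python
from collections import deque

class RollingBeta:
    """Maintain beta of y on x over the last `window` bars, updated in
    O(1) per bar instead of recomputing over the full window each time."""
    def __init__(self, window):
        self.window = window
        self.xs = deque()
        self.ys = deque()
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, x, y):
        # add the newest bar to the running sums
        self.xs.append(x); self.ys.append(y)
        self.sx += x; self.sy += y
        self.sxx += x * x; self.sxy += x * y
        # drop the bar that just left the window
        if len(self.xs) > self.window:
            ox, oy = self.xs.popleft(), self.ys.popleft()
            self.sx -= ox; self.sy -= oy
            self.sxx -= ox * ox; self.sxy -= ox * oy

    @property
    def beta(self):
        n = len(self.xs)
        if n < 2:
            return None  # not enough data yet: the "unreliable" phase
        cov = self.sxy - self.sx * self.sy / n
        var = self.sxx - self.sx * self.sx / n
        return cov / var if var else None
```

One caveat: running-sum formulas can lose precision over very long streams of large prices, so a production version might periodically recompute the sums from the window.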

Orders should be ignored during the learn_period because the algo's model is not yet reliable at that time.

Another case is an algo that picks up low-frequency events, like earnings reports; it would need some pre-start-date run to have enough signals at the start date.

Anh,

Here's an example that does analysis at market close and then trades, based on the analysis, at market open. If I'm understanding correctly, you should be able to devise a 'learn_period' with schedule_function. Instead of doing the analysis every day, as I do, you could do it every 60 days, for example. So long as your computation takes no more than 50 seconds, it should work.
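On Quantopian, the periodic analysis Grant describes would typically be wired up with schedule_function; a platform-independent sketch of the "re-run the expensive analysis every N days" pattern is below. All names here (run_analysis, handle_daily, ANALYSIS_INTERVAL) are hypothetical:

```python
ANALYSIS_INTERVAL = 60  # trading days between full re-fits (assumed)

def initialize(context):
    # force an analysis pass on the very first day
    context.days_since_analysis = ANALYSIS_INTERVAL
    context.model_ready = False

def run_analysis(context):
    # placeholder for the expensive computation (e.g. re-fitting beta)
    context.model_ready = True

def handle_daily(context):
    """Call once per trading day; re-runs the analysis every
    ANALYSIS_INTERVAL days and reports whether the model is ready."""
    context.days_since_analysis += 1
    if context.days_since_analysis >= ANALYSIS_INTERVAL:
        run_analysis(context)
        context.days_since_analysis = 0
    return context.model_ready
```

The same counter logic could live inside a function registered with schedule_function so the platform, rather than handle_data, drives the cadence.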

However, I would wonder if 'under the hood' history is getting updated every bar anyway? The clever Q engineers may have written code that 'sees' the history call within the function upon algo build, and then sets up to accumulate bar data in the background, as it arrives (in essence, history is always kept 'warm'). So, when you call history, the data are already in RAM; they aren't retrieved from disk. It isn't clear to me that there is any "running" of history; I suspect that the data are there automatically, whether it is called or not.

Grant

The issue with the history approach is that most rolling/expanding calcs shouldn't naively recompute over the whole window at each iteration. While that method is correct, it's orders of magnitude slower than doing some bookkeeping and adjusting the current stat with whatever data is entering/leaving the window. A proper "live" version of that stat will have similar optimizations.

I suppose one could run history once at the start to simulate the warm-up period and then update live accordingly, though that would make your code uglier.
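The seed-once-then-update pattern could look like the following sketch, with a rolling mean standing in for whatever stat the algo maintains. The class name and constructor shape are assumptions for illustration:

```python
from collections import deque

class RollingMean:
    """Rolling mean over a fixed window, seeded once from a warm-up
    series (e.g. one history() call at start) and then updated in O(1)
    per bar."""
    def __init__(self, window, warmup=()):
        self.window = window
        # keep only the most recent `window` warm-up values
        self.values = deque(list(warmup)[-window:], maxlen=window)
        self.total = float(sum(self.values))

    def update(self, x):
        if len(self.values) == self.window:
            # subtract the value the deque is about to evict
            self.total -= self.values[0]
        self.values.append(x)
        self.total += x
        return self.total / len(self.values)
```

The ugliness Dale mentions is visible here: the warm-up path and the live path are two different code routes into the same state, which a learn_period would unify.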

Dale,

Good point. If the stat can be updated successively, using new data as it comes in and dropping the old, that would be best, without having to re-compute across an entire trailing window. Examples are given on http://en.wikipedia.org/wiki/Moving_average.

However, if it is convenient to use 'history' and there's other overhead in the code and back-end that take a lot longer, then it won't matter, especially if the use of 'history' and stats are infrequent (i.e. not every call to handle_data).

Grant

Hi Grant,

Like Dale says, what I am trying to avoid is calling history(390, ...) in your analyze function. Instead, I just get the current bar and update the "indicator" on a rolling basis. That should speed up the algo's computation time by a factor on the order of 100.

Without support for learn_period, we would either need to call history every time and compute the "indicator" from scratch (which is costly), or initialize by calling history once on the first trading period and then switch to rolling updates afterwards (which is ugly).

It should not be too difficult for Quantopian to add support for learn_period: basically, run the algo for a period before the start date and suppress all the orders it creates during that period.

Best,

Anh,

I don't think this requirement is necessary: "*suppress all the orders it creates during that period*". In the learn period, you'll just calculate your parameters for the trading period, just like you use a 50-day look-back period to calculate today's value of a 50-day moving average. Unless, of course, your parameters depend on a P&L, as in a Walk-Forward Optimization technique. Is that the case?

Anh,

Well, effectively, if you use history, then you won't be sitting in cash for the first part of the backtest (or live trading). So, I don't see any way around it. But maybe I'm still not following. What statistics, etc. are you needing to compute, and how often?

Sorry, I'm still not catching on to the problem you need to solve...

Grant

The suppressing of orders comes down to how you want to structure your code and models. At what level should you know that you are in a learning period? How is this communicated? I can understand not wanting your model to have to differentiate between learning and live periods.

Personally, I prefer that strats/models are signal emitters and do not touch the order system directly. In that scenario, warming up can be more easily conceptualized as just ignoring signals, without dirtying the model. It also opens up better composability, i.e. using a model as more of an indicator for a meta-model. Though even in that case, since you can't define the simulation start_date in code, you'd need to coordinate with the GUI to get the proper warm-up events.

@Alex, I want to suppress the orders because during the learn period the model is not reliable enough. Perhaps I should explain better what it means to be "reliable". E.g. if I want to calculate a rolling 250-day beta and it is just 10 days into the learn period, then the beta value is not yet "reliable". Any orders placed during the learn period rely on "unreliable" data and hence should not be executed / counted toward algo performance.

@Grant: maybe a better example is fundamental data. There is no way to get historical data now except running through the historical dates and saving them in your algo, and that could very well be the case in the future. What would you do if your algo uses quarterly EPS and you want it to be ready to trade on the first day you submit it for live trading? My understanding is that now you would have to wait, in the worst case ~3 months, from the day your algo is submitted to the day live trading starts.

@Dale: agreed. It would be great if simulation and ordering could be separated. Another way to view what I am suggesting here is separating the simulation start date from the ordering start date; that way algos can do whatever preparation / data caching they need before making real orders. Combining algos into a meta-algo is definitely needed to build complex portfolios. That could be done at the Quantopian level by combining the orders placed by algos from the same user (in the same 1-minute interval, for example) before sending them to IB. I am not sure if it could be done at the IB level, but that is a more complicated topic.

Anh,

I'm not so familiar with the fundamental data yet. If there is no "warm up" then yes, you'd have to accumulate it yourself. And since anything in context gets wiped out after backtesting, you are stuck. You could manually copy-and-paste historical fundamental data into a CSV, which would then get uploaded via fetcher, when you go live. I think that the terms of use disallow any copy-and-pasting of data to files outside the platform, so you'd need to get the data elsewhere to be in compliance. But, you could compare the data you obtain elsewhere with data output to either the debugger or the log window, to make sure they agree.
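The CSV-plus-fetcher workaround Grant describes could be sketched as follows. The column names and row shape here are assumptions for illustration, not a fixed fetcher schema, and the data in the usage note is synthetic:

```python
import csv
import io

def eps_rows_to_csv(rows):
    """Serialize accumulated (date, symbol, eps) records into CSV text
    with a header row, the general shape a fetcher-style loader expects:
    a date column plus the custom value columns."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["date", "symbol", "eps"])
    for date, symbol, eps in rows:
        writer.writerow([date, symbol, eps])
    return buf.getvalue()
```

The resulting text would be hosted somewhere fetchable and loaded at initialize time, so the live algo starts with its quarterly history already populated instead of waiting months to accumulate it.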

Grant