Refresh Period Bug?

I think there may be a refresh_period bug in batch_transform, as shown in the code below.

# When run in daily mode from 1/7/2008 to 1/15/2008, this code  
# shows a problem with batch_transform.

# The refresh_period argument to @batch_transform is set to 2,  
# so the logged price window should update every other day.  
# But if SHOW_BUG is True, the window updates every day:  
#    2008-01-07 PRINT No prices  
#    2008-01-08 PRINT No prices  
#    2008-01-09 PRINT [ 177.58  171.23  179.5 ]  
#    2008-01-10 PRINT [ 171.23  179.5   178.02]  
#    2008-01-11 PRINT [ 179.5   178.02  172.4 ]  
#    2008-01-14 PRINT [ 178.02   172.4    178.738]  
#    2008-01-15 PRINT [ 172.4    178.738  169.   ]

# If SHOW_BUG is False, the window updates every other day as  
# it should (except for the first day):  
#    2008-01-07 PRINT No prices  
#    2008-01-08 PRINT No prices  
#    2008-01-09 PRINT [ 177.58  171.23  179.5 ]  
#    2008-01-10 PRINT [ 171.23  179.5   178.02]  
#    2008-01-11 PRINT [ 171.23  179.5   178.02]  
#    2008-01-14 PRINT [ 178.02   172.4    178.738]  
#    2008-01-15 PRINT [ 178.02   172.4    178.738]

SHOW_BUG = True

# Initialize context to pass to other methods.
def initialize(context):
    context.stock = sid(24)

# Handle trade data events.
def handle_data(context, data):
    # To show the bug, foo can be any property of
    # data[context.stock], e.g. price, volume, etc.
    foo = data[context.stock].volume if SHOW_BUG else 0
    prices = transform(data, foo)
    print(prices if prices is not None else 'No prices')

# Return the trailing 3-bar price window, refreshed every 2 days.
@batch_transform(window_length=3, refresh_period=2)
def transform(datapanel, foo):
    return datapanel['price'].as_matrix().flatten()

22 responses

Hello Michael,

I've noticed some problems with the batch transform, as well. See:

https://www.quantopian.com/posts/unexplained-error

https://www.quantopian.com/posts/batch-transform-minute-data

https://www.quantopian.com/posts/code-no-longer-running

Seems the recent change needs some thorough testing.

Grant

Hi,

Yes, there's a problem with the new batch_transform in minute mode. I have a fix that should make it to production soon.

Essentially, window_length was previously interpreted as a number of days. It is now interpreted as a number of bars, so in minute mode window_length=1 gives you only the last minute event. The fix multiplies window_length internally by 6.5 * 60 (the 390 minutes in a trading day) when a simulation is run in minute mode.

Sorry about that; I'll let you know once it's fixed. In the meantime you could just do the multiplication yourself. refresh_period, on the other hand, should work as before: if it's set to 1 it should update daily, not every minute.
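As a rough illustration of "doing the multiplication yourself", here is a minimal sketch. The helper name window_length_in_bars and the constant MINUTES_PER_TRADING_DAY are made up for this example and are not part of Quantopian or Zipline; the sketch only assumes a regular 6.5-hour trading session.

```python
# Assumption: a regular US session of 6.5 hours, i.e. 390 one-minute bars.
MINUTES_PER_TRADING_DAY = int(6.5 * 60)


def window_length_in_bars(window_length_days, minute_mode):
    """Convert a window length given in days to a number of bars.

    Hypothetical helper for illustration only; not a Quantopian API.
    """
    if minute_mode:
        return window_length_days * MINUTES_PER_TRADING_DAY
    return window_length_days


print(window_length_in_bars(3, minute_mode=True))
print(window_length_in_bars(3, minute_mode=False))
```

The result for minute mode is what you would pass as window_length while the bar-based interpretation is in effect.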

Thomas

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

@Michael: I think you have to call transform.handle_data(). I might be wrong, and we may already wrap it internally, but could you try that?

Thanks Thomas,

If I understand correctly, you will revert to both the window length and refresh period specified in units of days (regardless of whether the backtest is run on daily or minute data). So, in minute mode, window_length = 1 will return a window of 390 minute bars, while in daily mode, the window will be 1 daily bar.

In minute mode, will a fractional window length be allowed (e.g. if the algorithm requires fewer than 390 bars)?

Grant

Yes, that's correct.

At first I'll just restore the original behavior so as not to break existing algos too much. But allowing the number of bars to be specified would certainly be trivial to add as well. I'm not sure I like the fractional window length, but let me know if you have any suggestions for an interface. Maybe bars=10 would give you either 10 days or 10 minutes, depending on the mode. Setting this would override window_length. Not ideal, but maybe not too bad.

Thanks Thomas,

Say I'm only interested in a trailing window of 30 minute bars...I can just select the last 30 out of 390. So it is a matter of efficiency, since I'll be filling and maintaining a window of 390 bars when I only need 30. So, if the efficiency can be improved, then yes, it would make sense to give the user finer control over the window length. Otherwise, it might add clutter and confusion.
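For what it's worth, the "select the last 30 out of 390" step is just a slice. The sketch below uses a made-up list of 390 numbers as a stand-in for one day of minute prices; in an algorithm the values would come from the batch transform's datapanel instead.

```python
# Stand-in data: 390 fake minute prices (one full trading day).
full_day = [float(i) for i in range(390)]

# Keep only the trailing 30 bars of the 390-bar window.
last_30 = full_day[-30:]

print(len(last_30))
```

The same negative-index slice works on a NumPy array or on the rows of a pandas DataFrame.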

It's probably a topic for a different thread, but I'm curious how the folks who trade for a living handle the start of the trading day, when they are running an intra-day strategy (closing out trades prior to the end of the trading day). Do they use data from the prior day or after-hours data or foreign market data? Or higher frequency data than minute-level? It just seems that if the algorithm has to wait for 30-60 minutes to build up statistically strong signals, then a good portion of the trading day has already passed, right?

Grant

Alternatively, we just break backwards compatibility and window_length defines the number of bars from now on. Pretty direct and straightforward. But then refresh_period should probably also be in bars.

@Grant: Thank you for the links. It looks like both minute mode, as in your examples, and daily mode, as in the code above, are having problems. I agree that thorough testing is needed.

@Thomas: I'm not sure what you mean by calling transform.handle_data() -- the transform function object in the code above has no handle_data() method, so trying to call it causes a runtime exception:

Runtime exception: AttributeError: 'function' object has no attribute 'handle_data'

Hello Thomas,

It appears that y'all reverted to the original batch transform behavior:

2013-05-02 PRINT ----------------------------
2013-05-02 PRINT 2013-05-02 13:31:00+00:00
2013-05-02 PRINT ----------------------------
2013-05-02 PRINT Securities
2013-05-02 PRINT [24, 8229, 8554, 5061, 8459]
2013-05-02 PRINT Current prices
2013-05-02 PRINT [ 442.789 77.91 158.66 32.42 54.83 ]
2013-05-02 PRINT Trailing price window
2013-05-02 PRINT [[ 444.03 77.95 159.27 32.99 56.11 ] [ 444.139 77.79 159.3001 33.015 56.4 ] [ 444.176 77.81 159.2965 32.98 56.23 ] ..., [ 438.79 78.1 158.26 32.695 54.2392] [ 439.43 78.06 158.2725 32.74 54.26 ] [ 442.789 77.91 158.66 32.42 54.83 ]]
2013-05-02 PRINT Trailing window length
2013-05-02 PRINT 390
2013-05-02 PRINT ----------------------------
2013-05-02 PRINT 2013-05-02 13:32:00+00:00
2013-05-02 PRINT ----------------------------
2013-05-02 PRINT Securities
2013-05-02 PRINT [24, 8229, 8554, 5061, 8459]
2013-05-02 PRINT Current prices
2013-05-02 PRINT [ 442. 78.02 158.66 32.48 55. ]
2013-05-02 PRINT Trailing price window
2013-05-02 PRINT [[ 444.139 77.79 159.3001 33.015 56.4 ] [ 444.176 77.81 159.2965 32.98 56.23 ] [ 444. 77.7 159.24 32.96 56.33 ] ..., [ 439.43 78.06 158.2725 32.74 54.26 ] [ 442.789 77.91 158.66 32.42 54.83 ] [ 442. 78.02 158.66 32.48 55. ]]
2013-05-02 PRINT Trailing window length
2013-05-02 PRINT 390

Seems like a prudent move, since otherwise, folks will have to fix their algorithms. Aside from the potential efficiency hit, I don't see a problem. With refresh_period = 0, one can obtain a trailing window of N*390 minute bars, updated every minute (with integer N > 0).

Also, the batch transform is noticeably faster--thanks!

Grant

Hi Grant,

Indeed we did. There also shouldn't be a performance hit with running a longer window and only using the last elements of it.

Ultimately it would make for cleaner algorithm code if the window could be specified in the way we discussed, but as you say, there's an easy way around it.

Are there any outstanding issues with the batch_transform?

Thomas

Updating here as well, since there are multiple threads.

We've restored the behavior of window_length in minute mode to mean 390 * window_length, i.e. the window is described in days but filled with minute data.

The fix in Zipline can be seen here: https://github.com/quantopian/zipline/commit/b87d454938388919a034d3b378cad070ab39c828

Thanks for helping us spot it!

(And we agree that there is/was a gap in Zipline where sometimes only the daily case for a certain behavior is tested. Both Thomas and I will be working to help rectify that situation.)

Thanks for the updates. It sounds like minute mode is working well now. For daily mode, the changes don't seem to have fixed the behavior of the code sample that I posted above. Maybe there's something wrong with the way the sample is written, or with my assumptions of how it should behave?

Ha, good catch. That's a bug indeed and I know exactly where :).

https://github.com/quantopian/zipline/issues/160

Back to the original case:

I thought some more about this, and there is no single right way to handle it. If you pass different arguments to the batch_transform on a call where no refresh is due, we would have to ignore the new arguments in order to stick to the refresh_period. Instead, the behavior you are seeing is that we force a refresh whenever you pass a different argument. I think that's a better way to handle this than silently dropping the passed arguments, but I could be convinced otherwise. Thoughts?
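To make the trade-off concrete, here is a toy model of the behavior described above. It is not Zipline's implementation: the CachedTransform class and all of its attributes are hypothetical, and it only assumes that a cached result is reused until either refresh_period calls have passed or the extra arguments differ from the previous call.

```python
class CachedTransform(object):
    """Toy model of a refresh_period cache (hypothetical, not Zipline code)."""

    def __init__(self, func, refresh_period):
        self.func = func
        self.refresh_period = refresh_period
        self.calls_since_refresh = 0
        self.last_args = None
        self.cached = None
        self.has_value = False

    def __call__(self, window, *args):
        # A refresh is due on the first call or once refresh_period calls have passed.
        due = (not self.has_value) or self.calls_since_refresh >= self.refresh_period
        # Changed arguments force a refresh instead of being silently dropped.
        args_changed = self.has_value and args != self.last_args
        if due or args_changed:
            self.cached = self.func(window, *args)
            self.last_args = args
            self.calls_since_refresh = 0
            self.has_value = True
        self.calls_since_refresh += 1
        return self.cached


# One made-up 3-bar window per simulated day.
windows = [[d, d + 1, d + 2] for d in range(6)]

same_arg = CachedTransform(lambda w, foo: list(w), refresh_period=2)
changing_arg = CachedTransform(lambda w, foo: list(w), refresh_period=2)

for day, w in enumerate(windows):
    # foo is constant for same_arg and different every day for changing_arg.
    print(day, same_arg(w, 0), changing_arg(w, day))
```

In this model the result with a constant argument changes on every second call, while the result with a changing argument is recomputed on every call, which matches the pattern in the original post apart from its first-day quirk.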

Thomas,

I've only used Quantopian (not stand-alone Zipline), so I'm probably missing some subtleties. I agree that arguments shouldn't be silently dropped. But I don't understand why changed arguments have to be ignored if you maintain the refresh rate.

From the docs, I had three rules in mind:

  1. If a function needs a trailing window, decorate it with @batch_transform.
  2. To advance the window in whole day increments, use refresh_period=1.
  3. You can pass an extra parameter to the function; @batch_transform just forwards it.

With those rules, I tried to rank the current minute's price against yesterday's prices:

def handle_data(context, data):  
    current_price = data[context.stock].price  
    price_rank = get_price_rank(data, current_price)

@batch_transform(window_length=1, refresh_period=1)  
def get_price_rank(datapanel, current_price):  
    # … omitted code:  rank price against yesterday's prices  
    return rank  

But that doesn't work, because as you described, rules 2 and 3 conflict: If the parameter in 3 has changed, the refresh_period in 2 is ignored.

Although I'm not sure why 2 and 3 can't coexist, that's fine, I can work around it. Perhaps the doc should include a note about this behavior.

-- Michael

Hi Michael,

I agree about the documentation change.

In your case however, you don't have to pass current_price as it will be contained in the datapanel, no?

With refresh_period=1, I was thinking that:
-- the datapanel would only roll forward in whole day increments
-- so the last price in datapanel would always be yesterday's close minute (or today's open?), rather than the current intraday minute's price.

Is that incorrect?

Oh, this is in minute mode? Then with window_length=1 you get one complete day of minute bars. The last one will contain the current price.

But this is with refresh_period=1, which the doc says to use "if you wanted your minutely simulation to only advance the window in whole day increments, so that the end of the window would always be through yesterday's close."

In your case you'll have to set window_length=1 and refresh_period=1, but only return the datapanel from the batch_transform. The ranking would then have to be done outside the transform (for example in handle_data), where you rank each event against yesterday's datapanel.
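A minimal sketch of the "rank outside the transform" part follows. The original ranking code was omitted from the thread, so the rule used here (the fraction of yesterday's prices at or below the current price) is purely an assumption, and price_rank is a hypothetical helper name. The list of prices stands in for the price column of the datapanel returned by the batch transform.

```python
def price_rank(yesterday_prices, current_price):
    """Fraction of yesterday's prices at or below current_price.

    Hypothetical ranking rule for illustration; the thread does not say
    how the rank was actually computed.
    """
    at_or_below = sum(1 for p in yesterday_prices if p <= current_price)
    return at_or_below / float(len(yesterday_prices))


# Made-up stand-in for yesterday's minute prices.
yesterday = [10.0, 11.0, 12.0, 13.0]

print(price_rank(yesterday, 11.5))
```

Because current_price is no longer passed to the decorated function, the transform's arguments never change between calls and the refresh_period is left alone.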

OK -- thank you for the clarification.