Batch Transform In Minute Backtests

Can someone clarify what I see here, please? With this code:

def initialize(context):
    context.stocks = [sid(2), sid(24)]


def handle_data(context, data):
    # The batch transform returns None until its window has filled.
    d = get_daily_prices(data, context)
    if d is None:
        return


@batch_transform(window_length=1, refresh_period=10)
def get_daily_prices(datapanel, context):
    # Extract the price DataFrame (one column per sid) and print its summary.
    d = datapanel['price']
    print d
    return d

I get this output:

2012-01-03PRINT<class 'pandas.core.frame.DataFrame'> DatetimeIndex: 390 entries, 2012-01-03 14:31:00+00:00 to 2012-01-03 21:00:00+00:00 Data columns (total 2 columns): 24 390 non-null values 2 390 non-null values dtypes: float64(2)  
2012-01-17PRINT<class 'pandas.core.frame.DataFrame'> DatetimeIndex: 390 entries, 2012-01-17 14:31:00+00:00 to 2012-01-17 21:00:00+00:00 Data columns (total 2 columns): 24 390 non-null values 2 390 non-null values dtypes: float64(2)  
2012-01-18PRINT<class 'pandas.core.frame.DataFrame'> DatetimeIndex: 390 entries, 2012-01-17 14:32:00+00:00 to 2012-01-18 14:31:00+00:00 Data columns (total 2 columns): 24 390 non-null values 2 390 non-null values dtypes: float64(2)  
.
.
2012-01-18PRINT<class 'pandas.core.frame.DataFrame'> DatetimeIndex: 390 entries, 2012-01-17 21:00:00+00:00 to 2012-01-18 20:59:00+00:00 Data columns (total 2 columns): 24 390 non-null values 2 390 non-null values dtypes: float64(2)  
2012-01-31PRINT<class 'pandas.core.frame.DataFrame'> DatetimeIndex: 390 entries, 2012-01-31 14:31:00+00:00 to 2012-01-31 21:00:00+00:00 Data columns (total 2 columns): 24 390 non-null values 2 390 non-null values dtypes: float64(2)  
2012-02-01PRINT<class 'pandas.core.frame.DataFrame'> DatetimeIndex: 390 entries, 2012-01-31 14:32:00+00:00 to 2012-02-01 14:31:00+00:00 Data columns (total 2 columns): 24 390 non-null values 2 390 non-null values dtypes: float64(2)  
2012-02-01PRINT<class 'pandas.core.frame.DataFrame'> DatetimeIndex: 390 entries, 2012-01-31 14:33:00+00:00 to 2012-02-01 14:32:00+00:00 Data columns (total 2 columns): 24 390 non-null values 2 390 non-null values dtypes: float64(2)  

The second line of output is advanced 10 trading days from the first. The third line of output is advanced one minute from the second. This continues until the end of that day, when there is another 10-day advance followed by minutely ones.

Also, I don't understand the relationship between the date of the print statement and the start date of the data in the dataframe.

Is this by design?

P.

5 responses

Hi Peter,

I've never understood the thinking behind the refresh_period parameter. For minute data, I just set refresh_period=0, which causes the batch transform to update every minute, with a window size of 390*window_length minutes. The batch transform will then start returning data once the window is full; until then, it'll return None. So, the first datetime with data will correspond to the datetime when the batch transform accumulator has just filled up (e.g. for window_length=1, with a normal-length trading day, it'll be the last minute of the trading day).
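To make the "returns None until the window is full" behaviour concrete, here is a minimal plain-Python sketch. It is not the batch transform's actual implementation: the RollingWindow class and its update method are hypothetical names, and it assumes a regular 390-minute session as in the output above.

```python
from collections import deque

MINUTES_PER_DAY = 390  # assumption: regular-length trading day


class RollingWindow:
    """Hypothetical stand-in for the batch transform's accumulator."""

    def __init__(self, window_length):
        self.size = MINUTES_PER_DAY * window_length
        # A bounded deque drops the oldest bar once the window is full.
        self.bars = deque(maxlen=self.size)

    def update(self, bar):
        """Add one minute bar; return the window, or None while still filling."""
        self.bars.append(bar)
        if len(self.bars) < self.size:
            return None
        return list(self.bars)


window = RollingWindow(window_length=1)
# Feed minutes 1..399; each integer stands in for one minute bar.
results = [window.update(minute) for minute in range(1, 400)]
# Index of the first call that returned a full window.
first_full = next(i for i, r in enumerate(results) if r is not None)
```

With window_length=1 the first non-None result arrives on the 390th bar, and every later bar slides the window forward by one minute.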

Will this work for you? Or perhaps you are trying to do something else?

Grant

Hello Peter,

I sent an e-mail to Thomas Wiecki, who was involved in setting up the batch transform. He can describe how the batch transform should work with non-zero values of refresh_period under minutely trading.

Grant

Hello Grant,

Thanks for taking the time to look at this. I want to get a DataFrame of daily closing prices to use in a minutely backtest. It may be that a batch transform is not the way to go.

My confusion in the example above is that the first call to the batch transform behaves differently from the subsequent ones. This is because the DataFrame has to be filled, I suppose.

P.

Hi guys,

Sorry for the confusion. In theory, the refresh_period argument was meant to be useful in cases where you fit a model that takes a while to train (think machine learning). In that case you might not want to re-train your model every day/minute.

This is the way it works in daily mode: setting refresh_period=3 should call the batch transform every 3rd day. In minute mode, however, there is a bug that Peter stumbled over.
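As a rough illustration of the intended daily-mode behaviour (recompute an expensive result only every refresh_period calls and reuse the cached value in between), here is a small sketch. RefreshEvery and expensive_fit are hypothetical names invented for this example; this is not how batch_transform is implemented.

```python
class RefreshEvery:
    """Hypothetical helper: recompute only every `refresh_period` calls."""

    def __init__(self, func, refresh_period):
        self.func = func
        self.refresh_period = refresh_period
        self.calls_since_refresh = None  # None means "never computed yet"
        self.cached = None

    def __call__(self, data):
        due = (self.calls_since_refresh is None
               or self.calls_since_refresh >= self.refresh_period)
        if due:
            # Recompute and restart the counter.
            self.cached = self.func(data)
            self.calls_since_refresh = 0
        self.calls_since_refresh += 1
        return self.cached


fit_calls = []


def expensive_fit(day):
    """Stand-in for a slow model fit; records which days it ran on."""
    fit_calls.append(day)
    return "model fitted on day %d" % day


model = RefreshEvery(expensive_fit, refresh_period=3)
outputs = [model(day) for day in range(1, 8)]
```

In this sketch the fit runs on days 1, 4 and 7, and the days in between reuse the previous result. With refresh_period=0 it would recompute on every call.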

We talked about this recently and rather than trying to fix the (hard to understand) batch_transform we came up with something much easier to use. Eddie is actually working on that right now. I think we'll share the proposal here and see what you guys think.

As to your specific problem, Peter: I think you have to set refresh_period=0 and just check for the most recent time stamp. Or you could just wait for the new thing.
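One way to read "check for the most recent time stamp" is to keep only the last bar seen on each date. Here is a minimal plain-Python sketch of that idea. The daily_closes function and the sample prices are made up for illustration, and it assumes the bars arrive in chronological order with UTC timestamps like those in the output above.

```python
from datetime import datetime


def daily_closes(minute_bars):
    """Return {date: last price seen on that date} from (timestamp, price) pairs.

    Assumes the bars are in chronological order.
    """
    closes = {}
    for timestamp, price in minute_bars:
        # Later bars on the same date overwrite earlier ones.
        closes[timestamp.date()] = price
    return closes


# Illustrative minute bars (invented prices, UTC timestamps).
bars = [
    (datetime(2012, 1, 17, 20, 59), 101.0),
    (datetime(2012, 1, 17, 21, 0), 101.5),
    (datetime(2012, 1, 18, 14, 31), 102.0),
    (datetime(2012, 1, 18, 21, 0), 103.25),
]
closes = daily_closes(bars)
```

Applied to the DataFrame returned by the batch transform, the same grouping could probably be written as d.groupby(d.index.date).last(), though I have not run that inside a backtest.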

Thomas

Hello Thomas,

Thanks. I look forward to the new method. Please share it in a new thread as soon as you are able.

P.