Access elements returned from batch transform

I am using the example to get averages. In this setup, get_averages() should return an array, right? But when I look at the output of log.info, the log shows "2008-01-04 handle_data:24 INFO". If I try to index averages as an array, for example averages[1], I get "Runtime exception: TypeError: 'NoneType' object has no attribute '__getitem__'". How do I access the averages array?

def initialize(context):  
    set_universe(universe.DollarVolumeUniverse(99.9, 100.0))

@batch_transform(window_length=10)  
def get_averages(datapanel):  
  # get the dataframe of prices  
  prices = datapanel['price']  
  # return a dataframe with one row showing the averages for each stock.  
  return prices.mean()

def handle_data(context, data):  
  averages = get_averages(data)  
  # add a newline to the beginning of the log line so that the column header of the  
  # frame is properly indented.  
  log.info('\n%s' % averages)  
5 responses

Hello Richard,

This confused me at first. You get the error because get_averages() returns None until the batch transform has accumulated its full 10-day window, so 'averages' is None on the first calls to handle_data. Check for None before using the result:

def initialize(context):  
    set_universe(universe.DollarVolumeUniverse(99.9, 100.0))

@batch_transform(window_length=10)  
def get_averages(datapanel):  
    prices = datapanel['price']  
    return prices.mean()

def handle_data(context, data):  
    averages = get_averages(data)  
    if averages is None:  
        return  
    print averages[:1]  
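
To illustrate why the None check is needed, here is a toy rolling-window sketch in plain Python. It is not Quantopian's batch_transform implementation; make_windowed_mean and its window_length argument are hypothetical names used only to mimic the warm-up behaviour, where nothing is returned until the window is full.

```python
from collections import deque


def make_windowed_mean(window_length):
    """Build a function that returns None until window_length values are seen."""
    # deque with maxlen drops the oldest value automatically once it is full
    window = deque(maxlen=window_length)

    def update(price):
        window.append(price)
        if len(window) < window_length:
            # still warming up, like the None seen in handle_data
            return None
        return sum(window) / float(window_length)

    return update


# hypothetical 3-bar window fed four made-up prices
get_average = make_windowed_mean(window_length=3)
results = [get_average(p) for p in [10.0, 11.0, 12.0, 13.0]]
# the first two calls return None, the later calls return the window mean
```

Any caller of such a function has to handle the None case first, which is what the `if averages is None: return` guard does.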

Regards,

Peter

Thanks, Peter, that solved my problem. How can I attach the symbol name or SID to the array of means? I don't see an attribute for that on 'data'.

Hello Richard,

I'm out of my depth now so hopefully someone else will advise you! A couple of ideas or maybe not:

def handle_data(context, data):  
    averages = get_averages(data)  
    if averages is None:  
        return  
    for sid in data:  
        print averages[sid]  

and

def handle_data(context, data):  
    averages = get_averages(data)  
    if averages is None:  
        return  
    context.stocks = [sid for sid in data]  
    print averages[context.stocks[0]]  

In 'averages', the index is the sid, so averages[sid] returns the mean for that security.

Regards,

Peter

Hello Richard,

Here's the output of the attached algorithm:

1970-01-01 initialize:14 INFO universe is 98 to 98.1  
2013-05-06 PRINT Securities  
2013-05-06 PRINT [448, 25090, 42788, 23881, 19662, 39546, 1595, 4922]  
2013-05-06 PRINT Prices  
2013-05-06 PRINT  
[[  73.87    73.52    60.92   122.125   54.58    60.72    21.35   104.71  ]  
 [  73.41    72.545   59.71   122.29    54.33    57.55    20.33   104.56  ]  
 [  73.79    73.96    60.29   122.31    54.85    58.92    19.18   106.0625]  
 [  75.24    75.23    62.15   121.22    55.53    60.95    19.91   107.8   ]  
 [  76.02    76.09    63.87   121.13    55.71    62.65    21.02   107.89  ]]  
2013-05-06 PRINT Mean prices  
2013-05-06 PRINT [ 74.466 74.269 61.388 121.815 55. 60.158 20.358 106.2045]  

The code is:

import numpy as np

# globals for get_data batch transform decorator  
R_P = 1  # refresh period in days  
W_L = 5 # window length in days

def initialize(context):  
    context.stocks = []  
    bottom = 98  
    range = 0.1  
    log.info("universe is {b} to {t}".format(b=bottom, t=bottom+range))  
    set_universe(universe.DollarVolumeUniverse(bottom, bottom+range))  

def handle_data(context, data):  
    context.stocks = [sid for sid in data]  
    # get data (select prices, volume, etc. in batch transform get_data)  
    d = get_data(data, context.stocks)  
    if d is None:  
        return  
    print 'Securities'  
    print context.stocks  
    print 'Prices'  
    print d  
    print 'Mean prices'  
    print np.mean(d,axis=0)

@batch_transform(refresh_period=R_P, window_length=W_L) # set globals R_P & W_L above  
def get_data(datapanel,sids):  
    return datapanel['price'].as_matrix(sids)  

The list of sids corresponds to the columns of the numpy ndarray returned by the batch transform.
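
Because the columns line up with the sid list, the two can be zipped together to look up a mean by sid. The following is a minimal sketch in plain Python (no numpy) using the sids and prices from the output above; the names column_means and mean_by_sid are made up for illustration and are not part of the Quantopian API.

```python
# sids and the 5-day price window, copied from the printed output above
sids = [448, 25090, 42788, 23881, 19662, 39546, 1595, 4922]
prices = [
    [73.87, 73.52, 60.92, 122.125, 54.58, 60.72, 21.35, 104.71],
    [73.41, 72.545, 59.71, 122.29, 54.33, 57.55, 20.33, 104.56],
    [73.79, 73.96, 60.29, 122.31, 54.85, 58.92, 19.18, 106.0625],
    [75.24, 75.23, 62.15, 121.22, 55.53, 60.95, 19.91, 107.8],
    [76.02, 76.09, 63.87, 121.13, 55.71, 62.65, 21.02, 107.89],
]

# zip(*prices) transposes rows (days) into columns (one per sid),
# which is the plain-Python counterpart of np.mean(d, axis=0)
column_means = [sum(col) / len(col) for col in zip(*prices)]

# pair each sid with the mean of its column
mean_by_sid = dict(zip(sids, column_means))
```

With this mapping, mean_by_sid[448] gives the mean price for the first security in the list (about 74.466 in the output above).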

Cheers,

Grant

This is great, thank you both!