Quantopian bug? Works in Research but fails in backtest

I have the following piece of code, which I have verified works in Quantopian Research.
It fetches fundamentals and adds a few custom calculated rows, namely z-score, sector-grouped z-score, and sector-grouped mean.

import datetime

import numpy as np
import pandas as pd
from scipy import stats

fundamentals = init_fundamentals()

today = datetime.datetime.now().strftime('%Y-%m-%d')

fund_df = get_fundamentals(query(fundamentals.valuation_ratios.pe_ratio,  
                                 fundamentals.valuation_ratios.ev_to_ebitda,  
                                 fundamentals.valuation_ratios.sales_yield,  
                                 fundamentals.operation_ratios.roic,  
                                 fundamentals.operation_ratios.financial_leverage,  
                                 fundamentals.asset_classification.morningstar_sector_code)  
                             .filter(fundamentals.valuation.market_cap > 1e9)  
                             .filter(fundamentals.valuation_ratios.pe_ratio > 5)  
                             .order_by(fundamentals.valuation.market_cap),  
#                             .limit(4),  
                             today)

# OK, let's check out what we get back.  
# When we provide a query and a date, we get back the same type of response  
# as in the IDE: a dataframe with securities as columns and each requested  
# metric as rows.

def add_zscore(fund_df,columns):  
    df=fund_df.loc[columns].T.dropna().apply(stats.zscore).T  
    df = df.rename(dict((i,i+'_zscore') for i in columns))  
    df = pd.concat([fund_df,df])  
    return df

def add_gzscore(fund_df,columns):  
    df=fund_df.T.dropna().groupby('morningstar_sector_code')  
    df=df[columns].transform(stats.zscore).T  
    df = df.rename(dict((i,i+'_gzscore') for i in columns))  
    df = pd.concat([fund_df,df])  
    return df

def add_gmean(fund_df,columns):  
    df=fund_df.T.dropna().groupby('morningstar_sector_code')  
    df=df[columns].transform(lambda x: x.mean()).T  
    df = df.rename(dict((i,i+'_gmean') for i in columns))  
    df = pd.concat([fund_df,df])  
    return df

fund_df=add_zscore(fund_df,['pe_ratio','ev_to_ebitda','sales_yield','roic','financial_leverage'])

fund_df=add_gmean(fund_df,['pe_ratio','ev_to_ebitda','sales_yield','roic','financial_leverage'])

fund_df=add_zscore(fund_df,['pe_ratio','ev_to_ebitda','sales_yield','roic','financial_leverage'])

mean = fund_df.loc[['ev_to_ebitda','pe_ratio']].mean(axis=1)  
sd = fund_df.loc[['ev_to_ebitda','pe_ratio']].std(axis=1)

print mean  
print sd  
fund_df  

When I move the code into a backtest algorithm, I get:
152 Error Runtime exception: TypeError: cannot concatenate a non-NDFrame object
Line 152 is this line in the function add_gzscore():

df = pd.concat([fund_df,df])  

More code context is below.
I am stumped: I can't figure out what difference between the two environments I am missing.

def add_zscore(fund_df,columns):  
    df=fund_df.loc[columns].T.dropna().apply(stats.zscore).T  
    df = df.rename(dict((i,i+'_zscore') for i in columns))  
    df = pd.concat([fund_df,df])  
    return df

def add_gzscore(fund_df,columns):  
    df=fund_df.T.dropna().groupby('morningstar_sector_code')  
    df=df[columns].transform(stats.zscore).T  
    df = df.rename(dict((i,i+'_gzscore') for i in columns))  
    df = pd.concat([fund_df,df])  # <<<<<------ Where backtest errors out  
    return df

def add_gmean(fund_df,columns):  
    df=fund_df.T.dropna().groupby('morningstar_sector_code')  
    df=df[columns].transform(lambda x: x.mean()).T  
    df = df.rename(dict((i,i+'_gmean') for i in columns))  
    df = pd.concat([fund_df,df])  
    return df

def before_trading_start(context):
    # only query database at the beginning of the month  
    month = get_datetime().month  
    if context.last_month == month:  
        return  
    context.last_month = month  
    # selected top K largest companies on NYSE  
    fundamental_df = get_fundamentals(  
        query(  
            fundamentals.valuation_ratios.pe_ratio,  
            fundamentals.valuation_ratios.ev_to_ebitda,  
            fundamentals.valuation_ratios.sales_yield,  
            fundamentals.operation_ratios.roic,  
            fundamentals.operation_ratios.financial_leverage,  
            fundamentals.asset_classification.morningstar_sector_code,  
        )  
        .filter(fundamentals.valuation.market_cap >= MARKETCAP_LIMIT)  
        .filter(fundamentals.share_class_reference.is_primary_share == True)  
        .filter(fundamentals.share_class_reference.is_depositary_receipt == False)  
        .filter(fundamentals.company_reference.primary_exchange_id.in_(["NYSE", "NYS"]))  
        .order_by(fundamentals.valuation.market_cap.desc())  
        .limit(SYMBOLS_LIMIT)  
    )  
    t=fundamental_df.T

    dft = add_gzscore(fundamental_df,['ev_to_ebitda','sales_yield','roic','financial_leverage']).T    

2 responses

Try putting your add_gzscore call inside an if len(t): block. I still suspect there's some sort of pre-start code check that doesn't work properly, so I find myself adding guards like this all the time.

Thanks Simon.

Adding the following before the add_gzscore() call:

if not len(t): return

fixed the code.
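
For anyone hitting the same error, here is a minimal sketch of the same guard placed inside the helper rather than at the call site. It assumes the layout from the post (metrics as rows, securities as columns) and runs on plain pandas outside Quantopian. The name add_gzscore_safe, the local zscore helper (a stand-in for scipy.stats.zscore with its default ddof=0), and the sample tickers and values are all illustrative assumptions, not part of the original algorithm.

```python
import pandas as pd


def zscore(x):
    # Population z-score (ddof=0); stand-in for scipy.stats.zscore's default.
    return (x - x.mean()) / x.std(ddof=0)


def add_gzscore_safe(fund_df, columns, group_row='morningstar_sector_code'):
    # Hypothetical variant of add_gzscore from the post.
    # fund_df: metrics as rows, securities as columns.
    t = fund_df.T.dropna()
    if not len(t):
        # No securities to score (for example, an empty query result):
        # return the input unchanged instead of calling pd.concat.
        return fund_df
    g = t.groupby(group_row)[columns].transform(zscore).T
    g = g.rename(index=dict((c, c + '_gzscore') for c in columns))
    return pd.concat([fund_df, g])


# Made-up sample data: four securities in two sectors.
sample = pd.DataFrame(
    {'A': [10.0, 101.0], 'B': [20.0, 101.0],
     'C': [30.0, 102.0], 'D': [50.0, 102.0]},
    index=['pe_ratio', 'morningstar_sector_code'],
)
scored = add_gzscore_safe(sample, ['pe_ratio'])

# An empty frame (no securities) passes through untouched.
empty = pd.DataFrame(index=['pe_ratio', 'morningstar_sector_code'])
unchanged = add_gzscore_safe(empty, ['pe_ratio'])
```

Putting the check in the helper means every caller gets the protection, at the cost of silently returning a frame without the extra rows, so downstream code should not assume the _gzscore rows are always present.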