Looking for Help with "window_length" in a Custom Factor

I've been having trouble implementing a custom MACD factor in my pipeline. I want to use the talib.MACD function in a custom factor rather than the built-in MovingAverageConvergenceDivergence factor, because talib.MACD returns multiple outputs and gives me more flexibility. However, I'm not sure how to handle the "window_length" that must be defined or passed.

Here is the custom factor:

#custom MACD factor for the pipeline  
class MACDFactor(CustomFactor):  
    inputs = [USEP.close]  
    #no idea what is going on with this value  
    window_length = ?????  
    #handling NaNs... this confuses me, got the code from another notebook  
    def compute(self, today, assets, out, close):  
        anynan = columnwise_anynan(close)  
        for col_ix, have_nans in enumerate(anynan):  
            if have_nans:  
                out[col_ix] = nan  
                continue  
            #calculate the talib MACD data  
            macd, signal, hist = MACD(close[:, col_ix])  
            #output the MACD histogram  
            out[col_ix] = hist[-1]  

And here is where it is called from:

def make_pipeline():  
    #call the MACD custom factor; will probably
    #need to pass window_length here if it has to be calculated dynamically
    macd_hist = MACDFactor()  
    #construction of the pipeline  
    return Pipeline(  
        columns={  
            'macd_hist': macd_hist,  
            'latest_close': USEP.close.latest  
        },  
        screen=macd_hist.notnan()  
    )  

I know that the default periods MACD uses in its calculations are 12 (fast), 26 (slow), and 9 (signal) days.

Originally I tried using those as window_length, but they all return empty dataframes. In fact, any length from 1 to 33 returns empty dataframes.

For some reason at window_length = 34, the dataframe begins filling with data, but the MACD data is not the same as the MACD data returned by a talib.MACD function outside of the pipeline.
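One plausible reading of the 34 threshold is a warm-up effect: with the default periods, the slow EMA needs 26 closes before it produces its first value, and the signal line then needs 9 MACD values before the first histogram value exists. The sketch below only does that arithmetic. The helper name `macd_min_window` is hypothetical, and the lookback formula is an assumption about how talib counts its warm-up, although it agrees with the behaviour described above.

```python
def macd_min_window(fast=12, slow=26, signal=9):
    """Smallest number of closes that yields one non-NaN MACD histogram value.

    Assumption: the first MACD line value appears once the longer EMA has
    `longest` samples, and the first signal/histogram value appears once
    the signal EMA has `signal` MACD values.
    """
    longest = max(fast, slow)
    # Number of leading outputs with no defined histogram value.
    lookback = (longest - 1) + (signal - 1)
    # One more sample than the lookback gives exactly one valid value.
    return lookback + 1


print(macd_min_window())  # 34
```

Under this reading, any window shorter than 34 leaves `hist[-1]` as NaN for every asset, so a `notnan()` screen filters out every row.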

When window_length equals the number of trading days in my backtest, the MACD data returned by my factor matches the MACD data returned by the talib function outside of the pipeline.

I can increase window_length up to a couple thousand and it keeps filling the dataframe with incorrect data.
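A possible reason the values change with window_length is that MACD is built from exponential moving averages, which are recursive: each value depends on the previous one, so the result depends on where the series starts. The sketch below uses a plain-Python EMA seeded with a simple average of the first `period` values. That seeding is an assumption for illustration, not a copy of talib's internals, and the price series is synthetic.

```python
import math


def ema_last(values, period):
    """Final EMA value, seeded with the simple mean of the first `period` values."""
    if len(values) < period:
        raise ValueError("need at least `period` values")
    alpha = 2.0 / (period + 1)
    current = sum(values[:period]) / period
    for value in values[period:]:
        current = alpha * value + (1.0 - alpha) * current
    return current


# Synthetic closes (illustrative only): an oscillation on top of a slow drift.
prices = [100.0 + 10.0 * math.sin(i / 7.0) + 0.05 * i for i in range(500)]

# Reference value: the EMA computed over the entire history.
full_history = ema_last(prices, 26)

# The same EMA computed from only the most recent `window` closes.
errors = {
    window: abs(ema_last(prices[-window:], 26) - full_history)
    for window in (34, 100, 300)
}
for window, error in errors.items():
    print(window, error)
```

In this sketch the gap from the full-history value shrinks as the window grows, which would explain why a short window gives numbers that differ from a talib call over the whole backtest while a window spanning the whole backtest matches it.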

I'm confused about what is going on and how best to handle it. I know that window_length should probably equal the number of trading days in my backtest, but why? The furthest the MACD factor should ever need to look back is 26 days, right? Is there a built-in function that calculates the number of trading days in my backtesting window? I tried "(pandas.to_datetime('end date') - pandas.to_datetime('start date')).days", but I think that just gives me all calendar days, and it's still manual. Additionally, how should I handle this when I want to start paper trading? Would window_length have to be incremented by 1 each day? It seems like I'm doing something wrong.
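On the day-counting point, subtracting two dates and reading `.days` does return calendar days, weekends included. As a rough illustration using only the standard library, the sketch below counts Monday-to-Friday dates instead. The helper `count_weekdays` is hypothetical, and it does not exclude market holidays, so it only approximates a trading-day count.

```python
from datetime import date, timedelta


def count_weekdays(start, end):
    """Count Monday-Friday dates in [start, end], inclusive.

    Assumption: exchange holidays are ignored, so this can overcount
    actual trading days.
    """
    total_days = (end - start).days + 1
    return sum(
        1
        for offset in range(total_days)
        if (start + timedelta(days=offset)).weekday() < 5
    )


start, end = date(2020, 1, 6), date(2020, 1, 17)
print((end - start).days)          # calendar-day difference
print(count_weekdays(start, end))  # weekdays only
```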

I've just moved over to the backtesting environment, and I'm still learning Python. Thanks in advance for the help!

1 response

For anybody with the same question:
Check out Eddie's comment here and this page.