Custom factor

I'm trying to write my own custom factor. I want to run a function, fit_test, on each stock in the pipeline and return the result as a column in the pipeline. In the docs, I see a simple implementation of Momentum, but I didn't see anything for more complicated custom factors. I'm trying to play around with various models, and I'm not sure if I'm doing this correctly.

I create my pipeline

# Attach data pipelines  
    p1=attach_pipeline(  
        make_pipeline(),  
        'data_pipe'  
    )  
    my_factor = test(inputs=[USEquityPricing.close], window_length=50)  
    p1.add(my_factor, 'forecast')  

I create my custom factor. I'm trying to run the function fit_test on each stock for the last 30 days of price data:

class test(CustomFactor):

    def compute(self, today, asset_ids, out2, values):  
        out=[]  
        len_values=len(values)  
        for i in xrange(0,len_values):  
            column=[item[i] for item in values]  
            cleanedList = [x for x in column if str(x) != 'nan']  
            if len(cleanedList) > 5:  
                yhat = 0  
                param1,param2,param3 = fit_test(cleanedList)  
                len_cleanedlist=len(cleanedList)  
                yhat =  ....  
                out.append(yhat)  
        out2=out  
        print out2  
        return out2

I test my code using this:

    all=context.output.query('yesterday_close>0')  
    print all

The "print out2" displays data with valid numbers.
The "print all" displays all the forecasts as NaN.

Any pointers on how to write something like this? Thanks in advance.

5 responses

The best way to apply a function to each asset is to use NumPy's apply_along_axis function. See https://docs.scipy.org/doc/numpy/reference/generated/numpy.apply_along_axis.html
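
As a quick illustration of what apply_along_axis does, here is a minimal standalone sketch in plain NumPy, outside of any pipeline. The toy `values` array and the `last_minus_first` function are made up for the example; the only thing carried over from a factor is the layout, with rows as days and columns as assets.

```python
import numpy as np

# Toy stand-in for the `values` array a factor receives:
# rows are days in the window, columns are assets (made-up prices).
values = np.array([
    [10.0, 20.0, np.nan],
    [11.0, np.nan, np.nan],
    [12.0, 22.0, 30.0],
])

def last_minus_first(asset_values):
    """Hypothetical per-asset function: last valid price minus first valid price."""
    valid = asset_values[~np.isnan(asset_values)]
    if valid.size < 2:
        # Not enough data: return NaN so this asset still gets an output slot.
        return np.nan
    return valid[-1] - valid[0]

# axis=0 hands each column (one asset's window) to the function in turn,
# so the result has exactly one entry per asset.
result = np.apply_along_axis(last_minus_first, 0, values)
```

Note that the function returns NaN rather than skipping an asset with too little data. That keeps the result the same length as the number of columns, so each value stays lined up with its asset.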

In your case, to run the function 'fit_test' on each stock using the last 30 days of price data, one could do something like this:


# Define our custom factor
import numpy as np
from quantopian.pipeline import CustomFactor
from quantopian.pipeline.data.builtin import USEquityPricing

class Test(CustomFactor):
    inputs = [USEquityPricing.close]
    window_length = 30

    def compute(self, today, asset_ids, out, values):
        # Define the function one would like to apply to each asset.
        # The function will be passed a 1D numpy array of values.
        def fit_test(asset_values):
            # The function can be anything but must return a single real number (or NaN).
            # We'll just take the average of the values (i.e. the average price).
            average_price = np.nanmean(asset_values)
            return average_price

        # Apply the above function to each column and write the results into 'out' in place.
        # The zero means axis 0, which passes columns of the ndarray 'values'.
        out[:] = np.apply_along_axis(fit_test, 0, values)

See attached notebook for this in action. Good luck.
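
On why the original version showed NaN for every forecast: the likely cause is the line `out2=out`, which only rebinds the local name instead of writing into the array that was passed in. The sketch below reproduces that in plain NumPy. It assumes, consistent with the symptom described, that the caller hands in a NaN-filled output array and ignores whatever `compute` returns; the two function names are hypothetical.

```python
import numpy as np

def compute_rebinding(out, values):
    # Rebinds the local name only; the caller's array is never touched.
    out = np.nanmean(values, axis=0)
    return out  # assumed to be ignored by a caller that reads its own buffer

def compute_in_place(out, values):
    # Slice assignment copies the results into the caller's array.
    out[:] = np.nanmean(values, axis=0)

# Two days of made-up prices for two assets.
values = np.array([[1.0, 10.0], [3.0, 30.0]])

buffer_a = np.full(2, np.nan)  # caller-owned output, pre-filled with NaN
compute_rebinding(buffer_a, values)  # buffer_a is still all NaN afterwards

buffer_b = np.full(2, np.nan)
compute_in_place(buffer_b, values)   # buffer_b now holds the column means
```

This is why the Test factor above ends with `out[:] = ...` and has no return statement.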

Thanks Dan! That's very useful.

Is there a way to adjust timeouts? Anything but the simplest stuff times out. Also, it looks like 8237 values are passed into the factor each time. Any way to filter this? I'm running on S&P 500 stocks only.

One can't really adjust the timeouts. The 'before_trading_start' method gets 5 minutes, but running the pipeline gets 10. See this post: https://www.quantopian.com/posts/before-trading-start-timeout-fix

However, it is a good idea to mask, or filter, the assets passed to factors so they only compute values for the assets one is interested in. This definitely saves time. The notebook above does this. Simply supply a 'mask' when instantiating a factor.

# Import any built-in filters one wishes to use
from quantopian.pipeline.filters import Q500US

# Make a filter to get the desired assets. In this case, the Q500US stocks.
my_filter = Q500US()

# Make our factor.
# Using a mask limits the computation to only the specified assets.
my_custom_factor = Test(mask=my_filter)
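
To give a rough feel for why a mask saves time, here is a plain NumPy analogy. It is not the pipeline engine's actual code. The idea is that only the columns passing the filter reach the per-asset computation, and the remaining assets are left as NaN. The `mask` array and `per_asset_mean` function are hypothetical stand-ins for a filter result and a factor's compute step.

```python
import numpy as np

def per_asset_mean(values):
    # Stand-in for a factor's per-asset computation.
    return np.nanmean(values, axis=0)

# 3 days x 4 assets of made-up prices.
values = np.arange(12, dtype=float).reshape(3, 4)

# Hypothetical filter result: only assets 0 and 2 are in the universe.
mask = np.array([True, False, True, False])

# Only the masked columns are computed; the rest stay NaN.
out = np.full(values.shape[1], np.nan)
out[mask] = per_asset_mean(values[:, mask])
```

With a universe of a few hundred stocks out of several thousand, the per-asset function would run on a correspondingly smaller slice of the data.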


Thanks Dan. Let me try that.