Zipline Question

How do I multiply columns in a zipline dataset?

For example, if I am looking at PsychSignal sentiment data, how would I multiply bull_minus_bear by total_scanned_messages?

9 responses

Assuming you want a column in the pipeline dataframe to hold the product of two factors, simply use the * operator. Something like this:

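from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.psychsignal import stocktwits_free

# .latest turns each dataset column into a factor holding its most recent value
bull_minus_bear = stocktwits_free.bull_minus_bear.latest
total_scanned_messages = stocktwits_free.total_scanned_messages.latest

# Multiplying two factors yields a new factor
my_multiplied_factor = bull_minus_bear * total_scanned_messages

pipe = Pipeline()
pipe.add(my_multiplied_factor, 'my_multiplied_factor')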

This could also be done in one line:

my_multiplied_factor = stocktwits_free.bull_minus_bear.latest * stocktwits_free.total_scanned_messages.latest

Maybe that wasn't what the question was?

Sorry, I should have clarified. I am trying to take a moving average of the product of the two columns. Your example takes only the last known value of each column, so windowed calculations fail.

I attached the workbook I am using. It should be pretty straightforward; I am just trying to get familiar with pipeline.

One can't easily pass non-window-safe factors to other factors. However, it's often easy to just write a custom factor. I believe all you want to do is this:

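import numpy as np
from quantopian.pipeline import CustomFactor

class Multiplied_Mean(CustomFactor):
    # Set window length to whatever is desired
    window_length = 3
    # atws_pipe is assumed to be the sentiment dataset imported in the attached notebook
    inputs = [atws_pipe.bull_minus_bear, atws_pipe.total_scanned_messages]

    def compute(self, today, asset_ids, out, bull_minus_bear, total_scanned_messages):
        # Each input has shape (window_length, number of assets); average over the window
        out[:] = np.mean(bull_minus_bear * total_scanned_messages, axis=0)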

See attached notebook.
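
To make the shape handling in compute concrete, here is a minimal NumPy-only sketch that runs outside of Pipeline. It assumes each input arrives as a 2-D array of shape (window_length, number of assets) with the oldest day in the first row. The sample values and the helper name multiplied_mean are made up for illustration and are not part of the Quantopian API.

```python
import numpy as np

def multiplied_mean(bull_minus_bear, total_scanned_messages):
    """Per-asset mean over the window (axis 0) of the elementwise product."""
    return np.mean(bull_minus_bear * total_scanned_messages, axis=0)

# Hypothetical 3-day window for 2 assets (rows = days, columns = assets).
bull_minus_bear = np.array([[1.0, -1.0],
                            [2.0,  0.5],
                            [3.0,  0.0]])
total_scanned_messages = np.array([[10.0, 4.0],
                                   [20.0, 8.0],
                                   [30.0, 2.0]])

# Moving average of the product: one value per asset.
window_mean = multiplied_mean(bull_minus_bear, total_scanned_messages)

# What multiplying the two .latest factors corresponds to: newest row only.
latest_product = bull_minus_bear[-1] * total_scanned_messages[-1]
```

The two results differ because the windowed version averages every day's product, while the .latest version only sees the last row. If the data contains NaNs, swapping np.mean for np.nanmean may be worth considering; that is a suggestion and not something the notebook does.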

Thanks for your help. As a follow-up, now that I have the custom factor set up, how do I take the single-period return of that custom factor?

Bump

If the goal is to check how predictive this custom factor may be over 1 day (which I believe is what the last question asks), take a look at the Alphalens tool (https://www.quantopian.com/posts/alphalens-performance-analysis-of-predictive-alpha-factors-1).

Attached is a notebook to get you started.

Hope that helps.

Hey, I actually want to use the first custom factor in the derivation of a second custom factor.

In other words, if my first custom factor calculates the SMA of something, then I would like to be able to use the SMA values as an input to another custom factor. Does that make sense?

Ah, sorry for the confusion. Maybe take a look at this post https://www.quantopian.com/posts/question-custom-factor-as-inputs-for-another-custom-factor-in-pipeline . There are probably others too.

Basically, one needs to do two things. First, add 'window_safe = True' to the custom factor definition. That's what allows a factor to be used as an input to another factor. Then pass that factor as an input to the other factor. Something like this.

Define our custom factors.

class Multiplied_Mean(CustomFactor):
    '''
    Custom factor to return the mean value of bull_minus_bear * total_scanned_messages
    '''
    # Set window_safe = True to be able to use the output in other factors
    window_safe = True

    # Set window length to whatever is desired
    window_length = 3
    inputs = [atws_pipe.bull_minus_bear, atws_pipe.total_scanned_messages]

    def compute(self, today, asset_ids, out, bull_minus_bear, total_scanned_messages):
        out[:] = np.mean(bull_minus_bear * total_scanned_messages, axis=0)


class Double_Factor(CustomFactor):
    '''
    Simple (silly) factor which just doubles the input. Used for testing.
    '''
    window_length = 1

    def compute(self, today, asset_ids, out, input_values):
        # input_values has shape (window_length, number of assets); use the newest row
        out[:] = input_values[-1] * 2

Then use the factor as an input to another.

    # Instantiate our first factor
    sentiment_score = Multiplied_Mean(window_length=3, mask=base_universe)

    # Use the factor from above as the input to another factor
    score_times_two = Double_Factor(inputs=[sentiment_score], mask=base_universe)


See attached notebook. Good luck.
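
For intuition about the chaining, here is a rough NumPy-only simulation, not actual Pipeline code. It assumes the first factor is evaluated once per day, and that the second factor then receives a window of those daily outputs with shape (window_length, number of assets). The function names and sample data are hypothetical.

```python
import numpy as np

def rolling_multiplied_mean(a, b, window_length):
    """First factor: per-asset mean of a * b, one row per day with a full window."""
    product = a * b
    n_days = product.shape[0]
    return np.array([
        product[end - window_length:end].mean(axis=0)
        for end in range(window_length, n_days + 1)
    ])

def double_factor(input_values):
    """Second factor with window_length = 1: doubles the newest row of its input."""
    return input_values[-1] * 2

# Hypothetical data: 4 days x 2 assets.
a = np.array([[1.0, 2.0], [2.0, 2.0], [3.0, 2.0], [4.0, 2.0]])
b = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])

# Daily outputs of the first factor (2 days have a full 3-day window).
first = rolling_multiplied_mean(a, b, window_length=3)

# The second factor sees a 1-row window of the first factor's output.
second = double_factor(first[-1:])
```

The point of the sketch is only that the second factor consumes the first factor's per-day outputs rather than the raw dataset columns, which is what setting window_safe = True permits.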

Super helpful, thanks Dan!