Zipline Question

posted Sep 17, 2018

How do I multiply columns in a zipline dataset?

For example, if I am looking at Psychsignal sentiment data, how would I multiply bull_minus_bear by total_scanned_messages?

9 responses

Dan Whitnable

Sep 17, 2018

Assuming one wants a column in the pipeline output dataframe to contain the product of two factors, simply use the * operator. Something like this:

from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.psychsignal import stocktwits_free

# .latest gives the most recent known value of each column
bull_minus_bear = stocktwits_free.bull_minus_bear.latest
total_scanned_messages = stocktwits_free.total_scanned_messages.latest

# Multiplying two factors yields a new factor
my_multiplied_factor = bull_minus_bear * total_scanned_messages

pipe = Pipeline()
pipe.add(my_multiplied_factor, 'my_multiplied_factor')

This could also be done in one line:

my_multiplied_factor = stocktwits_free.bull_minus_bear.latest * stocktwits_free.total_scanned_messages.latest

Maybe that wasn't what the question was?

Zak Raicik

Sep 17, 2018

Sorry, I should have clarified. I am trying to take a moving average of the product of the two columns. Your example takes only the last known value of each column, so windowed calculations fail.

I attached the workbook I am using. It should be pretty straightforward; I am just trying to get familiar with pipeline.
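To make the distinction concrete, here is a small NumPy sketch of the two calculations. The numbers are made up for illustration (they are not Psychsignal data) and nothing here uses the Pipeline API: the product of the latest values looks only at the most recent day, while a moving average of the product needs every day in the window.

```python
import numpy as np

# Hypothetical 3-day history for a single asset (oldest first).
bull_minus_bear = np.array([1.0, -1.0, 2.0])
total_scanned_messages = np.array([10.0, 30.0, 5.0])

# Product of the latest values: only the last day is used.
latest_product = bull_minus_bear[-1] * total_scanned_messages[-1]

# Moving average of the product: every day in the window is used.
mean_of_product = np.mean(bull_minus_bear * total_scanned_messages)
```

With these values the two results differ in both size and sign, which is why the windowed version has to be computed from the raw columns rather than from .latest.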

Dan Whitnable

Sep 19, 2018

One can't easily pass factors that are not 'window safe' as inputs to other factors. However, it's often easy to just make a custom factor. I believe all you want to do is this:

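import numpy as np
from quantopian.pipeline import CustomFactor

# atws_pipe is not defined in this snippet; it is assumed to be the
# dataset alias set up elsewhere (e.g. in the attached notebook).
class Multiplied_Mean(CustomFactor):
    # Set window length to whatever is desired
    window_length = 3
    inputs = [atws_pipe.bull_minus_bear, atws_pipe.total_scanned_messages]

    def compute(self, today, asset_ids, out, bull_minus_bear, total_scanned_messages):
        # Inputs have one row per day in the window and one column per asset
        out[:] = np.mean(bull_minus_bear * total_scanned_messages, axis=0)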
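As a rough illustration of what happens inside compute, here is a plain NumPy sketch. Each input arrives as a 2D array with one row per day in the window and one column per asset; the values below are invented for the example and the arrays are built by hand rather than supplied by the pipeline.

```python
import numpy as np

# Hypothetical 3-day window for 2 assets (rows = days, columns = assets).
bull_minus_bear = np.array([[1.0, -2.0],
                            [0.5,  1.0],
                            [2.0,  0.0]])
total_scanned_messages = np.array([[10.0, 4.0],
                                   [20.0, 8.0],
                                   [ 5.0, 6.0]])

# Same statement as in compute: multiply day by day and asset by asset,
# then average down the day axis to get one value per asset.
out = np.empty(2)
out[:] = np.mean(bull_minus_bear * total_scanned_messages, axis=0)
```

The axis=0 argument is what collapses the window: the result has one entry per asset, which is the shape the out array expects.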

See attached notebook.

Zak Raicik

Oct 4, 2018

Thanks for your help. As a follow-up, now that I have the custom factor set up, how do I take the single-period return of that custom factor?

Bump

If one wants to check how predictive this custom factor may be over 1 day (which I believe is what the last question is asking), take a look at the Alphalens tool (https://www.quantopian.com/posts/alphalens-performance-analysis-of-predictive-alpha-factors-1).

Attached is a notebook to get you started.

Hope that helps.

Zak Raicik

Oct 5, 2018

Hey, I actually want to use the first custom factor in the derivation of a second custom factor.

In other words, if my first custom factor calculates the SMA of something, then I would like to be able to use the SMA values as an input to another custom factor. Does that make sense?

Dan Whitnable

Oct 5, 2018

Ah, sorry for the confusion. Maybe take a look at this post (https://www.quantopian.com/posts/question-custom-factor-as-inputs-for-another-custom-factor-in-pipeline). There are probably others too.

Basically, one needs to do two things. First, add 'window_safe = True' to the custom factor definition. That's what allows a factor to be used as an input to another factor. Then, input that factor to the other factor. Something like this.

Define our custom factors.

class Multiplied_Mean(CustomFactor):
    '''
    Custom factor to return the mean value of the bull_minus_bear * total_scanned_messages
    '''
    # Set window_safe = True to be able to use output in other factors
    window_safe = True

    # Set window length to whatever is desired
    window_length = 3
    inputs = [atws_pipe.bull_minus_bear, atws_pipe.total_scanned_messages]

    def compute(self, today, asset_ids, out, bull_minus_bear, total_scanned_messages):
        out[:] = np.mean(bull_minus_bear * total_scanned_messages, axis=0)


class Double_Factor(CustomFactor):  
    '''  
    Simple (silly) factor which simply doubles the input. Used to test  
    '''  
    window_length = 1  
    def compute(self, today, asset_ids, out, input_values):  
        out[:] = input_values * 2

Then use the factor as an input to another.

# Instantiate our first factor
sentiment_score = Multiplied_Mean(window_length=3, mask=base_universe)

# Use factor from above in another factor
score_times_two = Double_Factor(inputs=[sentiment_score], mask=base_universe)
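The chaining can also be traced by hand in plain NumPy. This is only a sketch of the data flow under made-up numbers, not the Pipeline API: the functions multiplied_mean and double_factor are hypothetical stand-ins for the two compute methods, and the loops imitate the pipeline calling each factor once per day with its trailing window.

```python
import numpy as np

def multiplied_mean(bull_minus_bear, total_scanned_messages):
    """Stand-in for Multiplied_Mean.compute: mean of the daily products."""
    return np.mean(bull_minus_bear * total_scanned_messages, axis=0)

def double_factor(input_values):
    """Stand-in for Double_Factor.compute with window_length = 1."""
    return input_values[-1] * 2

# Hypothetical data: 5 days (rows) for 2 assets (columns).
bmb = np.array([[1.0, 0.0], [2.0, 1.0], [3.0, 2.0], [4.0, 3.0], [5.0, 4.0]])
msgs = np.array([[1.0, 2.0], [1.0, 2.0], [1.0, 2.0], [1.0, 2.0], [1.0, 2.0]])

window_length = 3

# First factor: one output row per day that has a full trailing window.
sentiment_score = np.array([
    multiplied_mean(bmb[end - window_length:end], msgs[end - window_length:end])
    for end in range(window_length, len(bmb) + 1)
])

# Second factor: each day it sees a 1-row window of the first factor's output.
score_times_two = np.array([
    double_factor(sentiment_score[day:day + 1])
    for day in range(len(sentiment_score))
])
```

The second factor never touches the raw columns; it only consumes the first factor's daily output, which is the relationship that window_safe = True permits in the pipeline.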

See attached notebook. Good luck.

Joakim Arvidsson (Cream Mongoose)

Oct 5, 2018

Super helpful, thanks Dan!
