Masking Custom Factors

OK, I'm trying to apply a mask to a CustomFactor. My end goal is to find the mean or median P/E ratio of my base universe.

In this algo I just have a placeholder custom factor that I'm trying to mask. I've spent about an hour trying to mask it, but I keep getting the error you'll see below.

import numpy as np  
from quantopian.algorithm import attach_pipeline, pipeline_output  
from quantopian.pipeline import Pipeline, CustomFactor  
from quantopian.pipeline.data.builtin import USEquityPricing  
from quantopian.pipeline.factors import SimpleMovingAverage, AverageDollarVolume  
from quantopian.pipeline.data.psychsignal import stocktwits  
from quantopian.pipeline.data import morningstar  
from quantopian.pipeline.filters.morningstar import IsPrimaryShare





def do_something_expensive():  
    # This is just an example, but I'm trying to figure out how to sum all pe_ratios first, then divide each security's PE by the total PE of the universe of stocks  
    total=morningstar.valuation_ratios.pe_ratio.sum()  
    PE=morningstar.valuation_ratios.pe_ratio.latest  
    return PE/total



class MyFactor(CustomFactor):  
    #I don't understand this input part. What needs to go here?  
    inputs=morningstar.valuation_ratios.pe_ratio.latest  
    #I don't understand what each of these arguments in the compute function mean either?  
    def compute(self, today, asset_ids, out, close):  
        out[:] = do_something_expensive(close)  



def initialize(context):  
    pipe=Pipeline()  
    pipe=attach_pipeline(pipe, name='my_pipeline')  



    sma_10=SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=10)  
    sma_30=SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=30)  
    tradersentiment=stocktwits.bull_scored_messages.latest  
    tradersentiment2=stocktwits.bear_scored_messages.latest  
    screen1=tradersentiment>tradersentiment2  
    prices_under_5=2<sma_10<5  
    screen2=AverageDollarVolume(window_length=10, mask=prices_under_5).percentile_between(90,100)


    finalverse=screen2&screen1  
    #I want to mask my custom factor to ONLY RUN on my securities listed by screen2, however I continue to get an error 'ValueError: Latest does not have multiple outputs. There was a runtime error on line 54.'  
    my_factor = MyFactor(mask=screen2)


    pipe.add(sma_10, '10 day average')  
    pipe.add(sma_30, '30 day average')  
    pipe.add(tradersentiment, 'Bullish')  
    pipe.add(tradersentiment2, 'Bearish')  
    pipe.add(my_factor, 'my_factor')  
    pipe.set_screen(finalverse)  



def before_trading_start(context, data):  
    results=pipeline_output('my_pipeline')  
    print results.head(10)  
    context.pipeline_results=results  


2 responses

The problem isn't with your mask but rather with the line

inputs=morningstar.valuation_ratios.pe_ratio.latest  

inputs should be a list of one or more BoundColumns. A BoundColumn specifies a particular column of data from a particular dataset. pe_ratio.latest is a Latest factor rather than a BoundColumn, which is why the error message mentions Latest. See https://www.quantopian.com/help#zipline_pipeline_data_dataset_BoundColumn for a description. See https://www.quantopian.com/help#importing-datasets for examples of the datasets you can use. You can check out the columns (or fields) that each dataset offers at https://www.quantopian.com/data. There is also a "builtin" dataset that has open, close, volume, etc. for each security.

In your case, if you want to use the pe_ratio data, you would first find where that data resides and then import that dataset. Here it's in the "quantopian.pipeline.data.morningstar" dataset. Typically you do all your imports in the first lines of your code:

from quantopian.pipeline.data import morningstar as mstar   # you can give it any local name you wish  

Then in your CustomFactor you would specify you want to use pe_ratio as an input to your calculation or "compute" method:

inputs = [mstar.valuation_ratios.pe_ratio]  

Notice the brackets, which are Python syntax for creating a list, which is what inputs expects. If you want to use multiple pieces of data (or columns) in your compute function, simply add them inside the brackets, separated by commas.

Set the "window_length" parameter to how far back (in trading days) you want to get data for each column you specify. If you simply want the latest (one day back) value for each, then set

window_length = 1  

Finally, to pull it together, you asked, "I don't understand what each of these arguments in the compute function mean either?" Look at the documentation under https://www.quantopian.com/help#quantopian_pipeline_CustomFactor for a description of each parameter. The bottom line is to always list the first four parameters (self, today, asset_ids, out) without modification. The fifth and subsequent parameters are how your inputs get passed to the compute function, one parameter for each column of data listed in inputs. You can name these anything you wish. Their order is the same as the order of the columns in inputs.

def compute(self, today, asset_ids, out, pe_ratio):  

This passes a numpy array of P/E ratios to the function, where it can be referenced as "pe_ratio". There will be a row for each day (the number of rows equals window_length) and a column for each security (about 8000 or so).
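
To make the array handling concrete, here is a minimal sketch in plain numpy that runs outside Quantopian. The class name PEShareOfUniverse, the asset ids, and the P/E values are all made up for illustration, and the class is only a stand-in: on the platform it would subclass CustomFactor and declare inputs and window_length as described above. It assumes the window arrives with one row per day and one column per security.

```python
import numpy as np


class PEShareOfUniverse(object):
    """Hypothetical stand-in for a CustomFactor (not the real base class).

    On Quantopian this would subclass CustomFactor and set
    inputs = [mstar.valuation_ratios.pe_ratio] and window_length = 1.
    """
    window_length = 1

    def compute(self, today, asset_ids, out, pe_ratio):
        # Assumed shape of pe_ratio: (window_length, number_of_assets).
        latest = pe_ratio[-1]  # most recent day's row
        # Each security's P/E divided by the universe total, ignoring NaNs.
        out[:] = latest / np.nansum(latest)


# Made-up data: one day, four securities, one of them missing a P/E.
asset_ids = np.array([101, 102, 103, 104])
pe_window = np.array([[10.0, 20.0, np.nan, 70.0]])

out = np.full(len(asset_ids), np.nan)
PEShareOfUniverse().compute(None, asset_ids, out, pe_window)

# The universe median, which was the stated end goal, from the same row.
universe_median = np.nanmedian(pe_window[-1])
```

The nan-aware functions matter here because fundamentals data usually has gaps; a plain sum would turn the whole output into NaN as soon as one security is missing a P/E.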

One final note: a mask does not filter or screen the results. The factor still outputs a value for every security in the universe. The mask simply skips the computation for any securities it filters out and outputs NaN for them. It's typically used just to save compute time when your compute function is expensive.
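
The masking behavior can be imitated in plain numpy as well. This is only a sketch of the idea, not Quantopian's actual internals: run_masked is a hypothetical driver, and the ids, values, and mask are invented. It assumes compute is handed only the columns that pass the mask, while everything else keeps its NaN.

```python
import numpy as np


def share_of_total(today, asset_ids, out, pe_ratio):
    # Same idea as a compute method, written as a plain function.
    latest = pe_ratio[-1]
    out[:] = latest / np.nansum(latest)


def run_masked(compute, asset_ids, window, mask):
    """Hypothetical driver imitating one day of a masked factor."""
    out = np.full(len(asset_ids), np.nan)  # masked-out securities stay NaN
    masked_out = out[mask]                 # boolean indexing makes a copy
    compute(None, asset_ids[mask], masked_out, window[:, mask])
    out[mask] = masked_out                 # write results back in place
    return out


# Made-up data: one day, four securities, two of which pass the mask.
asset_ids = np.array([101, 102, 103, 104])
window = np.array([[10.0, 20.0, 30.0, 40.0]])
mask = np.array([True, False, True, False])

result = run_masked(share_of_total, asset_ids, window, mask)
```

The result still has one entry per security, so a screen is needed to drop the NaN rows. Under this assumption the total inside compute is taken over the masked universe only, which is what you want for a mean or median of your base universe.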

Good luck.

Hey Dan,

Sorry to respond just now. I've been traveling all week for work. This all makes perfect sense, and I appreciate the response. I'm going to spend some time playing around with what you've given me and will let you know if I have any more questions. You really cleared up almost all the concepts I was unsure about.

I'll let you know how it turns out sometime tomorrow.