Anyone else having problems with morningstar customfactors?

I have been working on this algorithm, and it recently stopped working. When I build and run it, the backtest runs for about 7 days and then freezes. Removing the pipe.add calls for the Morningstar factors makes it run smoothly again, which is why I suspect the custom factors that use Morningstar data are the problem. Does anyone have the same issue, or any suggestions?

from quantopian.algorithm import attach_pipeline, pipeline_output  
from quantopian.pipeline import Pipeline  
from quantopian.pipeline.data.builtin import USEquityPricing  
from quantopian.pipeline import CustomFactor  
from quantopian.pipeline.data import morningstar  
import numpy as np

class DividendYield(CustomFactor):
    # Latest Morningstar dividend yield, multiplied by 100.
    inputs = [morningstar.valuation_ratios.dividend_yield]
    window_length = 1

    def compute(self, today, assets, out, dy):
        out[:] = dy[-1] * 100


class Debt_Ratio(CustomFactor):
    inputs = [morningstar.balance_sheet.current_debt, morningstar.income_statement.total_revenue]
    window_length = 1

    def compute(self, today, assets, out, debt, revenue):
        # Constant stand-in; the intended ratio is commented out.
        out[:] = 2  # debt[-1] / revenue[-1] * 100

# DollarVolume will calculate yesterday's dollar volume for each stock in the universe.  
class DollarVolume(CustomFactor):  
    # We need close price and trade volume for this calculation.  
    inputs = [USEquityPricing.close, USEquityPricing.volume]  
    window_length = 1  
    # Dollar volume is volume * closing price.  
    def compute(self, today, assets, out, close, volume):  
        out[:] = (close[-1] * volume[-1])


def initialize(context):
    """  
    Called once at the start of the algorithm.  
    """  
    # User defined variables  
    context.leverage_long = 0.5  
    context.leverage_short = 0.5  
    context.long_number = 10  
    context.short_number = 10  
    # Create and attach pipeline  
    pipe = Pipeline()  
    attach_pipeline(pipe,name='pipeline')  
    # Add dividend yield to pipeline  
    dividend_yield = DividendYield()  
    pipe.add(dividend_yield, 'dividend_yield')  
    dy_filter = (dividend_yield > 2) & (dividend_yield < 6)  
    # Add debt ratio to pipeline
    debt_ratio = Debt_Ratio()  
    pipe.add(debt_ratio,'debt_ratio')  
    dr_filter = (debt_ratio > 0)  
    # Add valuation to pipeline (currently disabled)
    #valuation = Valuation()  
    #pipe.add(valuation,'valuation')  
    #v_filter = (valuation < 20) & (valuation > -10)  
    # Create the dollar_volume factor using default inputs and window_length  
    dollar_volume = DollarVolume()  
    dollar_filter = (dollar_volume > 5000000)  
    pipe.add(dollar_volume, 'dollar_volume')  
    #pipe.set_screen(dy_filter & dollar_filter & dr_filter)  
    pipe.set_screen(dollar_filter)

    # Rebalance at the start of each month, 1 hour after market open.
    schedule_function(rebalance, date_rules.month_start(), time_rules.market_open(hours=1))  
8 responses

Hi Jeppe,

Thanks for posting. At this point, accessing fundamental data through Pipeline can be pretty slow, especially if you are using a long window_length or using many BoundColumns. I'm sorry this is causing you problems.

For now I would advise that you try to minimize fundamentals use if you want improved performance. We're aware of the performance issue, and it's on our radar to improve.

Hi Nathan,
I'm trying to call just two Morningstar fields, one from balance_sheet and one from cash_flow_statement, with a window_length of 1, and my backtest is timing out. Here is a stripped-down test I ran to see what would happen in the simplest case:

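from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import AverageDollarVolume

# Factor1 and Factor2 are CustomFactor wrappers around the two Morningstar fields (definitions not shown).

def make_pipeline():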
    """  
    A function to create our dynamic stock selector (pipeline). Documentation on  
    pipeline can be found here: https://www.quantopian.com/help#pipeline-title  
    """  
    # Create a dollar volume factor.  
    dollar_volume = AverageDollarVolume(window_length=1)  
    # Pick the top 1% of stocks ranked by dollar volume.  
    high_dollar_volume = dollar_volume.percentile_between(99, 100)  
    # The Morningstar factors, wrapped in CustomFactor classes
    factor1 = Factor1(mask=high_dollar_volume)  
    factor2 = Factor2(mask=high_dollar_volume)  
    debt_cash = (factor1 < 0) & (factor2 > 0)  
    pipe = Pipeline(  
        screen = debt_cash,  
        columns = {  
            'dollar_volume': dollar_volume  
        }  
    )  
    return pipe  

Any recommendations on how to proceed? I'd really like to screen on those two factors!

Two factors should work... are you sure your problem is with the pipeline?

Sunil

Hi Sunil,
I'm reasonably certain. The TimeoutException occurs when I call:

context.output = pipeline_output('my_pipeline')  

in my rebalance function, which I have running on a monthly basis.

Can you think of something else it could be?

Hi Spencer,

You say you're running pipeline_output on a monthly basis. That would imply that it's running in a scheduled function, as opposed to before_trading_start.

It turns out that pipelines are run every day, regardless of whether pipeline_output is called. There's no performance benefit to limiting pipeline_output calls to once a month.

However, if you run pipeline_output in a scheduled function, the running time of the pipeline will count toward the time limit of your scheduled functions. Meanwhile, if you run it in before_trading_start, the pipeline's running time counts towards before_trading_start's separate time limit.

Since your scheduled functions are supposed to take place during a specific minute, their running time is limited to about a minute. But before_trading_start is allowed five minutes before it triggers a timeout. This is done specifically so that you can run pipeline_output in before_trading_start.

The bottom line is, you can move pipeline_output to before_trading_start in order to avoid the timeout, and you won't suffer a performance loss.
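
A minimal sketch of that change, assuming the pipeline was attached under the name 'my_pipeline' (the name used earlier in this thread). The body of rebalance is hypothetical, and the local pipeline_output stand-in is only there so the snippet can run outside the Quantopian IDE; on the platform the real import is used.

```python
try:
    from quantopian.algorithm import pipeline_output
except ImportError:
    # Hypothetical stand-in, used only when running outside the platform.
    def pipeline_output(name):
        return {"pipeline": name}


def before_trading_start(context, data):
    # The pipeline is computed every day anyway, so fetching its output here
    # counts against before_trading_start's longer time limit.
    context.output = pipeline_output('my_pipeline')


def rebalance(context, data):
    # Scheduled monthly; it only reads the result stored earlier that morning,
    # so no pipeline time is charged to the scheduled function.
    candidates = context.output
    return candidates
```

The scheduled function keeps its monthly schedule; only the pipeline_output call moves.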

Thanks, that seems to work now. Would performance be better if I called the get_fundamentals() function on an as-needed basis in either function? It seems like a waste to recompute the pipeline every day when I only look at it occasionally...

I know this is a few months old now, but I was thinking the same thing as Spencer. I only need to access monthly values of the pipeline for factor calculations. Is there a reason why you can't specify a frequency for running the pipeline?

Also, it would be good if the run_pipeline function in the research environment let you set a frequency. As it stands, you pull down daily data and then have to resample it into monthly data, which makes it really easy to crash your research environment.