Custom Pipeline factors beginner


edited Oct 16, 2015

Hi Everyone,

I would like to implement the following in Pipeline (from what I understand, Pipeline is the way to go for this algo), but I am totally stuck, as I am rather new to Quantopian and Python:

Compute 2 mean reversion factors:
-Compute the 4-week industry-relative return (for instance, over the 11 Morningstar sectors)
-Compute the percent above the 100-day low, with a 10-day lag

Compute 2 momentum factors:

-Compute the moving average over a 60-day window with a 10-day lag (i.e. if today is 16 October, I'd like to compute the MA over the 60 days ending on 2 October, 10 trading days before 16 October).

For this one I tried something like this but couldn't make it work:

class Factor1(CustomFactor):  
    # Pre-declare inputs and window_length  
    inputs = [USEquityPricing.close, ]  
    window_length = 70  
    # Compute factor1 value  
    def compute(self, today, assets, out, close):  
        x=close[0:59]  
        out[:] = x.apply(talib.SMA, timeperiod=60)

-Compute the slope of the 200-day trend line with a 10-day lag

Then I would like to rank stocks according to a "z-score" for each factor (z-score = (factor value of stock i - average factor value) / (std dev of factor value)). In the end, I would like to compute a "double layer" score: first within each factor category (momentum/mean reversion), so that I can put more weight on one factor than the other within the category (for instance score_mean_reversion = 0.7*score_industry_relative_return + 0.3*score_percent_above_100_days_low), and then across the 2 categories (total_score = 0.6*score_momentum + 0.4*score_mean_reversion).

Does anyone have an idea how to implement that in Pipeline (is it even possible)? My ultimate goal would be to add a few more factors (say, some fundamentals as well).

Hope my explanation is clear enough. Thanks in advance for any help.

2 responses

Scott Sanderson

Oct 23, 2015

For z-scoring your factors, you probably want to use scipy.stats.mstats.zscore before writing your outputs in compute. For your example of a "60-day MAVG, lagged 10 days", you might write something like this:

from numpy import nanmean, isnan, nan
from scipy.stats.mstats import zscore
from quantopian.pipeline import CustomFactor
from quantopian.pipeline.data.builtin import USEquityPricing

class LaggedSMAZScore(CustomFactor):
    inputs = [USEquityPricing.close]
    window_length = 70  # 60-day average plus a 10-day lag

    def compute(self, today, assets, out, close):
        # Rows are days (oldest first), so the first 60 rows end 10 days ago.
        # axis=0 averages down the rows, giving one value per asset (column).
        avgs = nanmean(close[0:60], axis=0)

        # Handling nans like this should rarely matter for `nanmean`, because it only produces
        # nan for an asset with no prices at all in the window.  For other aggregations though,
        # you probably want to ensure that you're only computing a zscore on non-nan values.
        nans = isnan(avgs)
        notnan = ~nans
        out[notnan] = zscore(avgs[notnan])
        out[nans] = nan
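To check the slicing logic outside of Pipeline, here is a minimal sketch of the same computation on a plain NumPy array. The function name `lagged_sma_zscore` and the random price data are made up for illustration, and the z-score is computed by hand (population standard deviation) rather than with scipy. It assumes rows are days ordered oldest first and columns are assets.

```python
import numpy as np

def lagged_sma_zscore(close, ma_window=60, lag=10):
    """Z-score of a lagged moving average (hypothetical helper).

    close: 2-D array with ma_window + lag rows (days, oldest first)
    and one column per asset.
    """
    assert close.shape[0] == ma_window + lag
    # The first ma_window rows end `lag` days before the most recent row.
    avgs = np.nanmean(close[:ma_window], axis=0)

    out = np.full(avgs.shape, np.nan)
    valid = ~np.isnan(avgs)
    vals = avgs[valid]
    # Cross-sectional z-score over the assets that have a value.
    out[valid] = (vals - vals.mean()) / vals.std()
    return out

# Made-up random-walk prices: 70 days, 5 assets.
rng = np.random.default_rng(0)
close = 100 + rng.normal(size=(70, 5)).cumsum(axis=0)
scores = lagged_sma_zscore(close)
```

Because only the first 60 rows are used, changing the 10 most recent rows leaves `scores` unchanged, which is what the 10-day lag is meant to achieve.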

You could do something similar for any other CustomFactors you write. In the relatively near term I'm planning on adding zscore as a method to CustomFactor so that, for any Factor you write, you can do MyFactor().zscore(), which would give you the zscore of that factor over the full universe of assets.
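As a rough sketch of what "something similar" could look like for the other two lagged factors in the question, here are the window computations in plain NumPy. The function names and the exact definitions are assumptions: "percent above low" is read as the price on the lagged date relative to the lowest price in the window ending on that date, and the "trend line" is read as an ordinary least-squares line through the closes. Inside a CustomFactor, the same slicing would go in compute with window_length set to the window plus the lag.

```python
import numpy as np

def pct_above_low(close, low_window=100, lag=10):
    """Fraction by which the lagged price sits above the lagged window low.

    close: 2-D array, rows = days (oldest first), columns = assets,
    with at least low_window + lag rows. lag must be positive.
    """
    window = close[-(low_window + lag):-lag]  # window ends `lag` rows before the last row
    low = np.nanmin(window, axis=0)
    return window[-1] / low - 1.0

def trend_slope(close, trend_window=200, lag=10):
    """Least-squares slope (price units per day) of each asset's lagged window.

    Note: np.polyfit does not skip NaNs, so an asset with missing prices
    in the window gets a NaN slope.
    """
    window = close[-(trend_window + lag):-lag]
    days = np.arange(trend_window)
    slope, _intercept = np.polyfit(days, window, 1)  # one fit per column
    return slope

# Made-up prices that rise by a fixed amount each day, for two assets.
prices = 50.0 + 2.0 * np.arange(210)
close = np.column_stack([prices, 3.0 * prices])
pct = pct_above_low(close)
slopes = trend_slope(close)
```

The raw outputs of either function could then be z-scored across assets in the same way as the moving average above.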

A more interesting generalization after that will be allowing something like MyFactor.zscore(groups=SomeClassifier()), which would allow things like zscoring by industry, or by percentile buckets of some other Factor.
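Until something like that exists, grouped z-scoring can be approximated by hand. The following is only a sketch in plain NumPy, not the planned Pipeline API: `zscore_by_group` is a hypothetical helper, and the sector codes are made-up integers standing in for whatever classifier you use.

```python
import numpy as np

def zscore_by_group(values, groups):
    """Z-score each value against the other members of its group.

    NaN inputs, groups with fewer than two valid members, and groups
    with zero spread are left as NaN in the output.
    """
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    out = np.full(values.shape, np.nan)
    for g in np.unique(groups):
        mask = (groups == g) & ~np.isnan(values)
        if mask.sum() < 2:
            continue
        vals = values[mask]
        std = vals.std()
        if std > 0:
            out[mask] = (vals - vals.mean()) / std
    return out

# Six made-up factor values in two made-up sectors.
values = np.array([1.0, 2.0, 3.0, 10.0, 20.0, np.nan])
sectors = np.array([101, 101, 101, 205, 205, 205])
z = zscore_by_group(values, sectors)
```

Each stock is compared only with stocks in the same sector, so a value of 20.0 in the second sector and a value of 3.0 in the first both end up with positive scores despite their very different raw levels.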

As far as ranking by a linear combination of some number of Factors goes, that should all just work the way you'd want if you just treat the Factor objects like numbers or ndarrays. For example, the result of SomeFactor1() + SomeFactor2() is a new factor that computes the sum of its inputs. Pretty much all the mathematical operators work, and you can mix in scalars as well. We use numexpr under the hood for these calculations, which means that simple arithmetic expressions like this are generally faster than the equivalent numpy code. All the gory details of how this works can be found here.

class SomeFactor1(CustomFactor):  
    ...

class SomeFactor2(CustomFactor):  
    ...

class SomeFactor3(CustomFactor):  
    ...


weighted_rank = ((0.3 * SomeFactor1()) + (0.5 * SomeFactor2()) + (0.4 * SomeFactor3())).rank()
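To make the arithmetic of the "double layer" score from the question concrete, here is a small NumPy sketch on made-up numbers. It uses the 0.7/0.3 and 0.6/0.4 weights from the question; the equal 0.5/0.5 split inside the momentum category is an assumption, since the question does not specify it, and the function and argument names are purely illustrative. With Pipeline factors, the same expressions would be written on the Factor objects instead of arrays.

```python
import numpy as np

def zscore(x):
    """Cross-sectional z-score using the population standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def two_layer_score(industry_rel_return, pct_above_low, lagged_sma, trend_slope):
    # Layer 1: weight the z-scored factors within each category.
    score_mean_reversion = (0.7 * zscore(industry_rel_return)
                            + 0.3 * zscore(pct_above_low))
    # Assumed equal weights; the question gives none for momentum.
    score_momentum = 0.5 * zscore(lagged_sma) + 0.5 * zscore(trend_slope)
    # Layer 2: weight the two category scores.
    return 0.6 * score_momentum + 0.4 * score_mean_reversion

# Made-up factor values for four stocks.
total = two_layer_score(
    industry_rel_return=[0.02, -0.01, 0.05, 0.00],
    pct_above_low=[0.10, 0.30, 0.05, 0.20],
    lagged_sma=[50.0, 20.0, 75.0, 30.0],
    trend_slope=[0.1, -0.2, 0.3, 0.0],
)

# Ascending ranks: 1 = lowest total score, 4 = highest.
ranks = total.argsort().argsort() + 1
```

Z-scoring each factor before weighting puts them on a common scale, so the weights express relative importance rather than being dominated by whichever factor has the largest raw values.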


laurent keller

Oct 25, 2015

Hi Scott,

Thanks a lot for your help.
