beta confusion question

Hi,

I am getting two different beta values in the pipeline when using

regression = RollingLinearRegressionOfReturns(  
        target=symbols('SPY'),  
        returns_length=2,  
        regression_length=100,  
        mask=universe,  
    )

and

the following custom factor that I found on the site (thanks to whoever provided it)

def _beta(ts, benchmark, benchmark_var):  
    return np.cov(ts, benchmark)[0, 1] / benchmark_var 

class Beta(CustomFactor):  
    inputs = [USEquityPricing.close]  
    window_length = 100  
    def compute(self, today, assets, out, close):  
        returns = pd.DataFrame(close, columns=assets).pct_change()[1:]  
        spy_returns = returns[symbols('SPY')]  
        spy_returns_var = np.var(spy_returns)  
        out[:] = returns.apply(_beta, args=(spy_returns,spy_returns_var,))


The differences are always there, and they are sometimes massive when I print out the two betas. For instance, when I run the pipeline for

result = run_pipeline(make_pipeline(), '2017-06-19', '2017-06-19')

I get an entry for DLTR with one beta at 4.121959 and the other at 1.633366.

So I'm confused. Could anyone shed any light on this?

Thanks

3 responses

The custom factor you found on the site is wrong. The problem is that, by default, 'numpy.cov' calculates the sample covariance (normalising by N - 1), while 'numpy.var' calculates the population variance (normalising by N), so the two normalisations do not cancel. To obtain the population covariance (normalising by the total N samples), set the parameter bias=True. So, the statement in the custom factor

return np.cov(ts, benchmark)[0, 1] / benchmark_var 

should be

# Set bias to True. Remove this and notice that the SPY betas are not correct (should be 1.0)  
return np.cov(ts, benchmark, bias=True)[0, 1] / benchmark_var 
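
The effect of the bias parameter can be checked outside Quantopian with plain NumPy. The sketch below is only an illustration on synthetic data: the arrays `benchmark` and `asset` and the helper `beta` are hypothetical names, not part of the original factor. It computes the beta of the benchmark against itself, which should be 1.0, with and without bias=True.

```python
import numpy as np

# Synthetic daily returns (hypothetical data, not market data).
rng = np.random.RandomState(0)
n = 100
benchmark = rng.normal(0.0, 0.01, size=n)
asset = 1.5 * benchmark + rng.normal(0.0, 0.005, size=n)

# np.var defaults to ddof=0, i.e. the population variance (divides by N).
benchmark_var = np.var(benchmark)

def beta(ts, benchmark, benchmark_var, bias):
    # bias=False (the NumPy default) divides the covariance by N - 1;
    # bias=True divides by N, matching np.var above.
    return np.cov(ts, benchmark, bias=bias)[0, 1] / benchmark_var

# Beta of the benchmark against itself.
spy_beta_sample = beta(benchmark, benchmark, benchmark_var, bias=False)
spy_beta_population = beta(benchmark, benchmark, benchmark_var, bias=True)

# Beta of the synthetic asset, and the ordinary least squares slope for comparison.
asset_beta_population = beta(asset, benchmark, benchmark_var, bias=True)
ols_slope = np.polyfit(benchmark, asset, 1)[0]

print(spy_beta_sample, spy_beta_population)
print(asset_beta_population, ols_slope)
```

With mismatched normalisations the self-beta comes out as N / (N - 1) rather than 1.0. With bias=True the result agrees with the least squares slope of the asset's returns on the benchmark's returns.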

One other small issue is that the window length in the custom factor should be 101, not 100. The built-in factor uses 100 days of returns, which requires 101 days of prices. See the attached notebook for this new custom factor in action and how it compares to the built-in factor. If you're interested, you can look at the zipline source to see how the built-in beta is calculated.
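
The window length point can be seen with a small pandas sketch. The price arrays and the helper `daily_returns` below are made up for illustration. The helper mirrors the `pct_change()[1:]` step in the custom factor, written here with `.iloc[1:]` to drop the first row, which has no previous price.

```python
import numpy as np
import pandas as pd

def daily_returns(close):
    # Percentage change between consecutive rows; the first row is NaN
    # because it has no previous price, so it is dropped.
    return pd.DataFrame(close).pct_change().iloc[1:]

# One hypothetical asset with 100 and 101 days of closing prices.
prices_100 = np.linspace(100.0, 110.0, 100).reshape(-1, 1)
prices_101 = np.linspace(100.0, 110.0, 101).reshape(-1, 1)

n_returns_100 = len(daily_returns(prices_100))
n_returns_101 = len(daily_returns(prices_101))

print(n_returns_100, n_returns_101)
```

A window of 100 prices leaves only 99 returns, so the custom factor needs window_length = 101 to regress over the same 100 returns as the built-in factor.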

Hi Dan,

Many thanks for this explanation. I guess it's a case of caveat emptor when looking at code on the site: there are some great ideas coded, but I can't assume that they are always coded correctly.

Hi Dan,

When I try your notebook in an algo, I get an error:

ValueError: all the input array dimensions except for the concatenation axis must match exactly

Here is the code:

# Import the libraries we will use here.  
from quantopian.algorithm import attach_pipeline, pipeline_output  
from quantopian.pipeline import Pipeline, CustomFactor  
from quantopian.pipeline.data.builtin import USEquityPricing  
from quantopian.pipeline.factors import AverageDollarVolume, Returns, Latest, SimpleMovingAverage, RSI, BollingerBands, MovingAverageConvergenceDivergenceSignal, RollingLinearRegressionOfReturns  
from quantopian.pipeline.filters.morningstar import Q1500US, Q500US  
from quantopian.pipeline.filters import StaticAssets  
import numpy as np  
import pandas as pd

# Define our Custom Factor  
def _beta(ts, benchmark, benchmark_var):  
    # Set bias to true to get population covariance  
    return np.cov(ts, benchmark, bias=True)[0, 1] / benchmark_var 

class Beta(CustomFactor):  
    inputs = [USEquityPricing.close]  
    window_length = 101  
    def compute(self, today, assets, out, close):  
        returns = pd.DataFrame(close, columns=assets).pct_change()[1:]  
        spy_returns = returns[symbols('SPY')]  
        spy_returns_var = np.var(spy_returns)  
        out[:] = returns.apply(_beta, args=(spy_returns,spy_returns_var,))  

def initialize(context):  
    """  
    Called once at the start of the program.  
    """  
    # Create and attach our pipeline (dynamic stock selector), defined below.  
    attach_pipeline(make_pipeline(context), 'trend_example')  
    schedule_function(rebalance,  
                      date_rules.week_start(days_offset=0),  
                      time_rules.market_open(minutes=5))

def before_trading_start(context, data):  
    """  
    Called every day before market open. This is where we get the securities  
    that made it through the pipeline.  
    """  
    pass

def make_pipeline(context):  
    spy = symbol('SPY')

    # Make a spy filter  
    spy_filter = StaticAssets([spy])  
    universe = Q1500US() | spy_filter

    recent_returns = Returns(window_length=101, mask=universe)  
    beta = Beta(mask=universe)  
    # Define a column dictionary that holds all the Factors  
    pipe_columns = {  
            'recent_returns':recent_returns,  
            'beta':beta,  
           }

    # Create a pipeline object with the defined columns and screen.  
    pipe = Pipeline(columns=pipe_columns,screen=universe)

    return pipe

def setup_context(context):

    context.output = pipeline_output('trend_example').dropna()  
    context.long_secs = context.output[(context.output['recent_returns']>0)]  
    context.short_secs = context.output[(context.output['recent_returns']<0)]  
    # A list of the securities that we want to order today.  
    context.security_list = context.long_secs.index.union(context.short_secs.index).tolist()

    # A set of the same securities, sets have faster lookup.  
    context.security_set = set(context.security_list)  

def compute_weights(context):  
    # Set the allocations to even weights for each long position, and even weights  
    # for each short position.  
    long_weight = 0  
    if len(context.long_secs) > 0 :  
        long_weight = context.long_leverage / len(context.long_secs)  
    short_weight = 0  
    if len(context.short_secs) > 0 :  
        short_weight = context.short_leverage / len(context.short_secs)  
    return long_weight, short_weight  


def rebalance(context,data):  
    setup_context(context)

    long_weight, short_weight = compute_weights(context)

    for stock in context.security_list:  
        if data.can_trade(stock):  
            if stock in context.long_secs.index:  
                order_target_percent(stock, long_weight)  
            elif stock in context.short_secs.index:  
                order_target_percent(stock, short_weight)
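
The thread does not diagnose this error, so the following is only one possible cause, stated as an assumption. In the algorithm environment, `symbols('SPY')` may return a list rather than a single asset, in which case `returns[symbols('SPY')]` would select a one-column table instead of a single series. The sketch below uses plain NumPy arrays with hypothetical names to show that 'numpy.cov' raises a ValueError when the benchmark has shape (N, 1) instead of (N,).

```python
import numpy as np

rng = np.random.RandomState(0)
ts = rng.normal(size=100)            # one asset's returns, shape (100,)
bench_1d = rng.normal(size=100)      # benchmark as a single series, shape (100,)
bench_2d = bench_1d.reshape(-1, 1)   # benchmark as a one-column table, shape (100, 1)

# Two 1-D inputs of equal length give a 2x2 covariance matrix.
ok = np.cov(ts, bench_1d, bias=True)

# A (100, 1) benchmark cannot be stacked with a (1, 100) row, so np.cov fails.
try:
    np.cov(ts, bench_2d, bias=True)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(ok.shape)
print(error_message)
```

If that is what happens here, selecting the benchmark column with the singular `symbol('SPY')`, as `make_pipeline` already does, would be one thing to try.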