Notebook

21-day returns of a given factor

First, import the things we need.

In [6]:
# Pipeline imports
from quantopian.research import run_pipeline
from quantopian.pipeline import Pipeline, CustomFactor
from quantopian.pipeline.factors import Returns
from quantopian.pipeline.filters import QTradableStocksUS

# Import any datasets we need
from quantopian.pipeline.data import USEquityPricing

Define the factor one wants to analyze.

In [5]:
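# Define some arbitrary factor
class Momentum(CustomFactor):
    """Conventional momentum factor.

    Return from the oldest close in the 250-day window to the close
    20 rows from the end, so the most recent 19 days are skipped.
    """
    inputs = [USEquityPricing.close]
    window_length = 250

    def compute(self, today, assets, out, prices):
        # prices has shape (window_length, number of assets); row 0 is the oldest
        out[:] = (prices[-20] - prices[0]) / prices[0]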
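
To see what the compute step does outside of a pipeline, here is a minimal sketch that applies the same arithmetic to a synthetic price window. The prices and the variable names (days, prices, momentum) are made up for illustration; only the formula comes from the factor above.

```python
import numpy as np

# Toy price window: 250 rows (trading days) x 2 columns (assets).
# These prices are synthetic and exist only for illustration.
window_length = 250
days = np.arange(window_length, dtype=float)
prices = np.column_stack([
    100.0 + days,        # asset A rises by 1 per day
    200.0 - 0.5 * days,  # asset B falls by 0.5 per day
])

# Same arithmetic as Momentum.compute: return from the oldest row
# (prices[0]) to the row 20 from the end (prices[-20]).
momentum = (prices[-20] - prices[0]) / prices[0]
print(momentum)
```

Row -20 of a 250-row window is row 230, so asset A goes from 100 to 330 and asset B from 200 to 85. The last 19 rows never enter the calculation.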

Set the dates over which one wants to analyze the factor

In [9]:
pipe_start_date = '2018-1-1'
pipe_end_date = '2019-1-1'

# Select a pricing end date after the pipeline end date so forward returns can be calculated.
# It should be at least 21 trading days later if one wants to calculate 21-day returns.
pricing_end_date = '2019-2-15'
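
A quick sanity check on the gap between the two end dates can be sketched with NumPy's business-day counter. This is only an approximation: it counts weekdays and does not exclude exchange holidays, and the names longest_period, to_iso, weekdays_available and enough_pricing are introduced here for illustration.

```python
import numpy as np
import pandas as pd

pipe_end_date = '2019-1-1'
pricing_end_date = '2019-2-15'
longest_period = 21  # longest forward-return horizon, in trading days

def to_iso(date_string):
    """Normalize a date string such as '2019-1-1' to '2019-01-01'."""
    return pd.Timestamp(date_string).strftime('%Y-%m-%d')

# Weekdays from pipe_end_date (inclusive) to pricing_end_date (exclusive).
# Exchange holidays are NOT excluded, so this overstates trading days a little.
weekdays_available = int(
    np.busday_count(to_iso(pipe_end_date), to_iso(pricing_end_date))
)

enough_pricing = weekdays_available >= longest_period
print(weekdays_available, enough_pricing)
```

With the dates used here the count is 33 weekdays, which leaves room for a 21-day forward return even after a couple of holidays are taken out.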

Step 1 is to get our factor data from a pipeline. There will be a factor value for each security and each day in our pipeline. We'll use that as the factor we want to analyze.

In [11]:
# Pipeline definition
def make_pipeline():
    my_universe = QTradableStocksUS()
    my_factor = Momentum(mask=my_universe)

    return Pipeline(
        columns={
            'my_factor' : my_factor,
        },
        screen=my_universe
    )

# Pipeline execution
data_output = run_pipeline(
    make_pipeline(),
    start_date=pipe_start_date,
    end_date=pipe_end_date
)

Pipeline Execution Time: 1.12 Seconds

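Step 2 is to get some actual pricing data to calculate returns from. Make sure to get pricing data for every asset in the pipeline output, over a date range that starts no later than the pipeline start date and ends far enough after the pipeline end date to cover the longest forward-return period.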

In [14]:
# Set the factor we want to analyze. Must be a single column from the pipeline dataframe.
factor_data = data_output.my_factor

pricing_data = get_pricing(
    # All assets that appear at least once in "factor_data"
    symbols=factor_data.index.levels[1],
    start_date=pipe_start_date,
    # Must be after run_pipeline()'s end date. Explained more in lesson 4
    end_date=pricing_end_date,
    # Generally, you should use open pricing. Explained more in lesson 4
    fields='open_price'
)

# Show the first 5 rows of pricing_data
pricing_data.head(5)
Out[14]:
Equity(2 [ARNC]) Equity(21 [AAME]) Equity(24 [AAPL]) Equity(25 [ARNC_PR]) Equity(31 [ABAX]) Equity(39 [DDC]) Equity(41 [ARCB]) Equity(52 [ABM]) Equity(53 [ABMD]) Equity(62 [ABT]) ... Equity(52734 [RBZA_W]) Equity(52735 [RBZ]) Equity(52736 [GLU_PRB]) Equity(52737 [SIRV_V]) Equity(52738 [ILPT_V]) Equity(52744 [AMCI_W]) Equity(52745 [AMCI]) Equity(52746 [PHCF]) Equity(52747 [DELL]) Equity(52749 [ERSX])
2018-01-02 00:00:00+00:00 26.917 3.439 166.928 86.500 49.347 NaN 35.206 36.856 188.13 56.889 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2018-01-03 00:00:00+00:00 27.212 3.473 169.253 87.662 49.845 NaN 36.741 36.783 193.29 57.661 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2018-01-04 00:00:00+00:00 27.794 3.329 169.262 88.369 49.477 NaN 36.395 36.969 198.00 58.159 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2018-01-05 00:00:00+00:00 28.829 3.230 170.145 88.369 50.791 NaN 36.395 37.654 200.78 57.710 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2018-01-08 00:00:00+00:00 29.253 3.578 171.038 NaN 54.872 NaN 35.751 38.163 208.24 57.524 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 10199 columns
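
The factor data is indexed by (date, asset), and index.levels[1] is how the cell above collects the assets to request prices for. Here is a small sketch of that lookup on a toy Series. The tickers AAA, BBB and CCC and the name toy_factor are placeholders, not real pipeline output.

```python
import pandas as pd

# Toy stand-in for one pipeline column: a Series with a (date, asset) MultiIndex.
dates = pd.to_datetime(['2018-01-02', '2018-01-03'])
index = pd.MultiIndex.from_tuples(
    [
        (dates[0], 'AAA'),
        (dates[0], 'BBB'),
        (dates[1], 'BBB'),
        (dates[1], 'CCC'),
    ],
    names=['date', 'asset'],
)
toy_factor = pd.Series([0.1, -0.2, 0.3, 0.05], index=index, name='my_factor')

# Level 1 of the index holds each distinct asset once,
# no matter how many dates it appears on.
assets = toy_factor.index.levels[1]
print(list(assets))
```

One caveat: pandas keeps unused level values after a MultiIndex is filtered, so on a filtered Series index.get_level_values(1).unique() may be the safer way to list only the assets still present.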

Step 3 is to calculate the returns from this pricing data and merge them with the pipeline output data. Alphalens provides a convenience function for this called get_clean_factor_and_forward_returns.

This is where one specifies the forward returns to compute: in this case 5-, 10-, and 21-day returns.

In [16]:
from alphalens.utils import get_clean_factor_and_forward_returns

merged_data = get_clean_factor_and_forward_returns(
  factor=factor_data, 
  prices=pricing_data,
  periods=(5, 10, 21)
)

# Show the first 5 rows of merged_data
merged_data.head(5)
Dropped 1.2% entries from factor data: 1.2% in forward returns computation and 0.0% in binning phase (set max_loss=0 to see potentially suppressed Exceptions).
max_loss is 35.0%, not exceeded: OK!
Out[16]:
date                       asset                     5D        10D        21D     factor  factor_quantile
2018-01-02 00:00:00+00:00  Equity(2 [ARNC])    0.091169   0.113497   0.091504   0.256587  4
                           Equity(24 [AAPL])   0.025796   0.035201  -0.017606   0.497497  5
                           Equity(31 [ABAX])   0.090178   0.266500   0.440392  -0.087789  2
                           Equity(41 [ARCB])   0.029540   0.019684  -0.009828   0.284247  4
                           Equity(52 [ABM])    0.051362   0.026699   0.007841   0.042179  2
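
The 5D, 10D and 21D columns are forward returns: the return from each date to a date that many trading days later. The idea can be sketched in plain pandas on a toy price series. This is an illustration of the concept, not Alphalens's own implementation, and the ticker AAA, the prices and the 2-day period are made up.

```python
import pandas as pd

# Toy prices for one hypothetical asset over six business days.
dates = pd.date_range('2018-01-02', periods=6, freq='B')
toy_prices = pd.DataFrame(
    {'AAA': [100.0, 102.0, 104.0, 106.0, 108.0, 110.0]},
    index=dates,
)

period = 2  # forward-return horizon in rows (trading days)

# Forward return on day t = price[t + period] / price[t] - 1.
# shift(-period) lines the future price up with the current row.
forward_returns = toy_prices.shift(-period) / toy_prices - 1.0
print(forward_returns)
```

The last `period` rows come out as NaN because no future price exists for them. This is why the pricing data has to extend past the pipeline end date.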

Step 4 is to run any of the many Alphalens analysis tools. Use the output of get_clean_factor_and_forward_returns as the input. That's it!

In [17]:
from alphalens.tears import create_full_tear_sheet

create_full_tear_sheet(merged_data)
Quantiles Statistics
factor_quantile        min        max       mean       std   count    count %
1                -0.999819  -0.018291  -0.251968  0.143810  108315  20.019666
2                -0.244648   0.113184  -0.021179  0.060754  108152  19.989539
3                -0.096816   0.264923   0.108139  0.063958  108159  19.990833
4                 0.021100   0.508047   0.268832  0.085485  108152  19.989539
5                 0.176027  30.496063   0.870324  0.978012  108265  20.010424

Returns Analysis
                                                   5D     10D    21D
Ann. alpha                                      0.063   0.047  0.037
beta                                            0.150   0.109  0.077
Mean Period Wise Return Top Quantile (bps)     10.531   8.821  6.117
Mean Period Wise Return Bottom Quantile (bps)  -3.596  -1.666  0.073
Mean Period Wise Spread (bps)                  14.127  10.476  6.043

Information Analysis
                      5D     10D     21D
IC Mean            0.014   0.007  -0.005
IC Std.            0.137   0.131   0.112
Risk-Adjusted IC   0.101   0.051  -0.048
t-stat(IC)         1.607   0.817  -0.761
p-value(IC)        0.109   0.415   0.448
IC Skew           -0.350  -0.366  -0.229
IC Kurtosis       -0.524  -0.568  -0.535

Turnover Analysis
                            10D    21D     5D
Quantile 1 Mean Turnover  0.158  0.238  0.109
Quantile 2 Mean Turnover  0.340  0.459  0.248
Quantile 3 Mean Turnover  0.382  0.504  0.283
Quantile 4 Mean Turnover  0.321  0.436  0.232
Quantile 5 Mean Turnover  0.142  0.216  0.097

                                     5D    10D    21D
Mean Factor Rank Autocorrelation  0.975  0.951  0.902
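
The IC rows in the Information Analysis table are rank correlations between the factor and the forward returns, computed per date and then summarized. Here is a minimal sketch of that idea on toy data shaped like merged_data. The values, the tickers and the helper rank_ic are invented for illustration, and the rank correlation is built from pandas ranks rather than by calling Alphalens.

```python
import pandas as pd

# Toy frame shaped like merged_data: (date, asset) index, a factor column
# and one forward-return column. All numbers are made up.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(['2018-01-02', '2018-01-03']), ['AAA', 'BBB', 'CCC', 'DDD']],
    names=['date', 'asset'],
)
toy = pd.DataFrame(
    {
        'factor': [0.1, 0.2, 0.3, 0.4, 0.4, 0.3, 0.2, 0.1],
        '5D': [0.01, 0.02, 0.03, 0.04, 0.01, 0.02, 0.03, 0.04],
    },
    index=index,
)

def rank_ic(group):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return group['factor'].rank().corr(group['5D'].rank())

# One IC value per date, then the mean across dates.
ic_by_date = pd.Series(
    {date: rank_ic(group) for date, group in toy.groupby(level='date')}
)
ic_mean = ic_by_date.mean()
print(ic_by_date)
print(ic_mean)
```

On the first toy date the factor ranks the assets exactly as the returns do (IC of 1), and on the second it ranks them in reverse (IC of -1), so the mean IC is 0. A mean IC near zero, as in the 21D column above, suggests the factor carries little ranking information at that horizon.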