First, import the things we need.
# Pipeline imports
from quantopian.research import run_pipeline
from quantopian.pipeline import Pipeline, CustomFactor
from quantopian.pipeline.factors import Returns
from quantopian.pipeline.filters import QTradableStocksUS
# Import any datasets we need
from quantopian.pipeline.data import USEquityPricing
Define the factor one wants to analyze.
# Define some arbitrary factor
class Momentum(CustomFactor):
    """ Conventional Momentum factor """
    inputs = [USEquityPricing.close]
    window_length = 250

    def compute(self, today, assets, out, prices):
        out[:] = (prices[-20] - prices[0])/prices[0]
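To see what the `compute` line above does, here is a minimal NumPy sketch that applies the same arithmetic to a made-up price window. The toy prices and the two-asset layout are assumptions for illustration only; inside a real pipeline, `prices` is supplied by the pipeline engine with one row per day and one column per asset.

```python
import numpy as np

# Toy price window: 250 rows (days, oldest first) x 2 columns (assets).
# These numbers are invented purely for illustration.
window_length = 250
days = np.arange(window_length, dtype=float)
prices = np.column_stack([
    100.0 + days,        # asset A rises by 1 per day
    200.0 - 0.5 * days,  # asset B falls by 0.5 per day
])

# Same arithmetic as Momentum.compute: the return from the oldest row
# (prices[0]) to the row 20 from the end (prices[-20]).
# The most recent 19 rows are deliberately left out.
momentum = (prices[-20] - prices[0]) / prices[0]
print(momentum)
```

Skipping the most recent rows means the factor ignores the latest few weeks of price moves and measures only the longer-term trend.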
Set the dates over which one wants to analyze the factor
pipe_start_date = '2018-1-1'
pipe_end_date = '2019-1-1'
# Select an end date after the pipeline end date so forward returns can be calculated.
# It should be at least 21 trading days later if one wants to calculate 21 day returns.
pricing_end_date = '2019-2-15'
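As a rough sanity check on the dates above, the sketch below estimates the earliest pricing end date that leaves room for 21-day forward returns. It uses pandas' `BDay` offset, which counts weekdays and ignores market holidays, so the result is only a lower bound. The variable names `longest_period` and `earliest_safe_end` are illustrative, not part of the original notebook.

```python
import pandas as pd
from pandas.tseries.offsets import BDay

pipe_end_date = pd.Timestamp('2019-01-01')
pricing_end_date = pd.Timestamp('2019-02-15')
longest_period = 21  # longest forward-return horizon, in trading days

# BDay skips weekends but not market holidays, so treat this as a lower bound.
earliest_safe_end = pipe_end_date + BDay(longest_period)
has_enough_buffer = pricing_end_date >= earliest_safe_end

print(earliest_safe_end.date(), has_enough_buffer)
```

Picking an end date a couple of weeks beyond this bound, as the notebook does, leaves slack for holidays.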
Step 1 is to get our factor data from a pipeline. There will be a factor value for each security and each day in our pipeline. We'll use that as the factor we want to analyze.
# Pipeline definition
def make_pipeline():
    my_universe = QTradableStocksUS()
    my_factor = Momentum(mask=my_universe)

    return Pipeline(
        columns={
            'my_factor': my_factor,
        },
        screen=my_universe
    )
# Pipeline execution
data_output = run_pipeline(
    make_pipeline(),
    start_date=pipe_start_date,
    end_date=pipe_end_date
)
Step 2 is to get some actual pricing data to calculate returns from. Make sure to get pricing data for all assets in the pipeline output, covering the full pipeline date range plus enough extra days at the end to compute the longest forward return.
# Set the factor we want to analyze. Must be a single column from the pipeline dataframe.
factor_data = data_output.my_factor
pricing_data = get_pricing(
    symbols=factor_data.index.levels[1],  # Finds all assets that appear at least once in "factor_data"
    start_date=pipe_start_date,
    end_date=pricing_end_date,  # must be after run_pipeline()'s end date. Explained more in lesson 4
    fields='open_price'  # Generally, you should use open pricing. Explained more in lesson 4
)
# Show the first 5 rows of pricing_data
pricing_data.head(5)
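The `factor_data.index.levels[1]` lookup relies on the factor series having a two-level index. The sketch below rebuilds that shape with plain pandas, assuming a (date, asset) index as the lookup implies; the tickers and values are invented. It also shows that `levels` keeps assets that have been filtered out, whereas `get_level_values(1).unique()` reports only those still present.

```python
import pandas as pd

# Toy stand-in for the pipeline output column: one value per (date, asset).
dates = pd.to_datetime(['2018-01-02', '2018-01-03'])
index = pd.MultiIndex.from_product(
    [dates, ['AAA', 'BBB', 'CCC']], names=['date', 'asset']
)
factor_data = pd.Series([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], index=index, name='my_factor')

# Level 1 lists each asset once, however many days it appears on.
assets = list(factor_data.index.levels[1])

# After filtering rows, `levels` still remembers the dropped asset...
filtered = factor_data[factor_data.index.get_level_values('asset') != 'CCC']
stale = list(filtered.index.levels[1])

# ...while get_level_values reports only assets that remain.
present = list(filtered.index.get_level_values(1).unique())

print(assets, stale, present)
```

For an unfiltered pipeline output the two approaches agree, so the notebook's use of `levels[1]` is fine as written.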
Step 3 is to calculate the returns from this pricing data and merge it with the pipeline output data. There is a convenience method one can use in Alphalens called get_clean_factor_and_forward_returns.
This is where one specifies the forward returns to compute: in this case, 5, 10, and 21 day returns.
from alphalens.utils import get_clean_factor_and_forward_returns
merged_data = get_clean_factor_and_forward_returns(
    factor=factor_data,
    prices=pricing_data,
    periods=(5, 10, 21)
)
# Show the first 5 rows of merged_data
merged_data.head(5)
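To make the idea of a forward return concrete, here is a simplified pandas sketch on made-up prices. It is not the Alphalens implementation, only an illustration of the concept: for each day, compare the price `period` rows later with that day's price. The ticker and prices are invented.

```python
import pandas as pd

# Invented open prices for one asset over six business days.
prices = pd.DataFrame(
    {'AAA': [10.0, 11.0, 12.1, 12.1, 13.31, 14.641]},
    index=pd.date_range('2018-01-02', periods=6, freq='B'),
)

period = 2
# Return from each day's price to the price `period` rows later.
forward_returns = prices.shift(-period) / prices - 1

print(forward_returns)
```

The last `period` rows come out as NaN because there is no later price to compare against, which is why the pricing data has to extend past the end of the pipeline dates.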
Step 4 is to run any of the many Alphalens analysis tools. Use the output of get_clean_factor_and_forward_returns as the input. That's it!
from alphalens.tears import create_full_tear_sheet
create_full_tear_sheet(merged_data)
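For intuition about what kind of question the tear sheet answers, the sketch below computes two simple statistics by hand on invented data for a single date: a rank correlation between factor values and forward returns, and the mean forward return of the low and high factor halves. This is a hand-rolled illustration, not the Alphalens code path; the column names `factor` and `5D` are assumed here to mirror the merged data.

```python
import pandas as pd

# Invented factor values and 5-day forward returns for six assets on one date.
data = pd.DataFrame({
    'factor': [0.9, 0.1, 0.5, 0.7, 0.3, 0.2],
    '5D':     [0.03, -0.02, 0.01, 0.04, 0.00, -0.01],
})

# Spearman-style rank correlation: Pearson correlation of the ranks.
ic = data['factor'].rank().corr(data['5D'].rank())

# Split assets into two equal-sized factor buckets and compare mean returns.
data['bucket'] = pd.qcut(data['factor'], 2, labels=['low', 'high'])
mean_by_bucket = data.groupby('bucket', observed=True)['5D'].mean()

print(round(ic, 4))
print(mean_by_bucket)
```

A factor with predictive value would tend to show a positive rank correlation and a higher mean return in the high bucket, as this toy data does by construction.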