
Non-US Factor Example

This notebook constructs an example pipeline on German equities with factors built from pricing data and several FactSet datasets. A custom-defined factor is then analyzed with the Alphalens API and used to construct an example target portfolio.

1. Define Factors and Create Pipeline

In [1]:
# Import various Dataset Libraries:
from quantopian.pipeline.data import EquityPricing
from quantopian.pipeline.data.factset import Fundamentals, GeoRev, RBICSFocus, EquityMetadata

# Import 2 built-in factors for returns and average dollar volume:
from quantopian.pipeline.factors import Returns, AverageDollarVolume
In [2]:
# 1-day returns:
returns = Returns(window_length=2)

The CustomFactor class lets you define your own factors. Below is an example that computes short-term momentum:

In [3]:
# Import CustomFactor:
from quantopian.pipeline.factors import CustomFactor

# Creating CustomFactor for short-term momentum:
class st_mom(CustomFactor):
    inputs=[EquityPricing.close]
    window_length = 22

    def compute(self, today, assets, out, price):
        out[:] = (price[-1] - price[0]) / price[0]
        
# CustomFactor using short-term momentum:
st_momentum = st_mom()

Below are additional factors created from FactSet's Fundamentals, RBICS Focus, Geographic Revenue, and Equity Metadata datasets:

In [4]:
# Annual sales factor from FS Fundamentals converted to USD with currency conversion:
annual_sale_usd = Fundamentals.sales_af.fx('USD').latest

# RBICS sector classification factor:
sector = RBICSFocus.l2_name.latest
    
# Create country-level revenue exposure DataSets by slicing GeoRev:
GeoRevUS = GeoRev.slice('US')

# Create estimated revenue percentage factors for the geographic locations referenced above:
rev_exposure_US = GeoRevUS.est_pct.latest

# Show listing currency:
listing_currency = EquityMetadata.listing_currency.latest
Note: Use Self-Serve Data to compose other factors from custom data.
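
A hedged sketch of the Self-Serve pattern is below. The module path and column name are hypothetical placeholders: each uploaded dataset is imported from a module specific to your account, so substitute your own.

# Hypothetical Self-Serve import; 'user_0000000000000000' and 'my_custom_dataset'
# are placeholders for your account-specific module and uploaded dataset name:
from quantopian.pipeline.data.user_0000000000000000 import my_custom_dataset

# Build a factor from a (placeholder) column of the uploaded data:
my_signal = my_custom_dataset.my_column.latest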

An example Pipeline, set to the German equities domain, is defined below using most of the factors created above:

In [5]:
# Import libraries needed to run Pipeline as well as chosen equity domain (German equities):
from quantopian.pipeline.domain import DE_EQUITIES
from quantopian.pipeline import Pipeline
from quantopian.research import run_pipeline
In [31]:
# Define our pipeline output with the factors (defined above):
pipe = Pipeline(
    columns={
        'returns': returns,
        'annual_sales_usd': annual_sale_usd,
        'st_momentum': st_momentum,
        'sector': sector,
        #'rev_exposure_US': rev_exposure_US,
        'listing_currency': listing_currency
    },
    domain=DE_EQUITIES, # Set the pipeline domain to equities traded on German exchanges
    # Screen out equities with low average trading volume:
    screen=AverageDollarVolume(
        inputs=[EquityPricing.close.fx('EUR'), EquityPricing.volume],
        window_length=20,
    ) > 1000000,
)
In [32]:
# Set the dates that the pipeline should run between:
start_date = '2016-01-01'
end_date = '2016-10-01'
In [33]:
# Run the pipeline:
results = run_pipeline(
    pipe,
    start_date= start_date,
    end_date= end_date
)
results.head()

Pipeline Execution Time: 28.60 Seconds
Out[33]:
annual_sales_usd listing_currency returns sector st_momentum
2016-01-04 00:00:00+00:00 Equity(1178883552065623 [RAA]) 6.010794e+08 EUR 0.021655 Industrial Manufacturing 0.066548
Equity(1178896686004567 [RIO1]) 4.348258e+10 EUR 0.003358 Mining and Mineral Products -0.137566
Equity(1178961330058052 [CBK]) 2.037291e+10 EUR -0.011565 Banking -0.073827
Equity(1178965609044310 [C012]) NaN EUR -0.002277 None -0.036758
Equity(1178969987363650 [FRE]) 2.838727e+10 EUR -0.007671 Healthcare Services -0.052019
In [34]:
results.listing_currency.unique()
Out[34]:
[EUR]
Categories (1, object): [EUR]

2. Evaluate Factors with Alphalens

Alphalens is a Quantopian open source library for performance analysis of predictive (alpha) factors. The main function of Alphalens is to surface the most relevant statistics and plots about an alpha factor.
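
The full tear sheet used below bundles the returns, information, and turnover analyses. If only part of that output is needed, alphalens also exposes individual tear sheet functions; a minimal sketch, assuming the cleaned factor data produced later in this section (al_data):

# Run a single analysis instead of the full tear sheet (al_data is created below):
from alphalens.tears import create_returns_tear_sheet, create_information_tear_sheet

create_returns_tear_sheet(al_data)
create_information_tear_sheet(al_data)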

In [12]:
# Import Alphalens and pandas libraries:
import alphalens as al
import pandas as pd
In [13]:
# Define and run a Pipeline for 1-day returns data:
returns_pipe = Pipeline(
    columns={
        '1D': Returns(window_length=2),
    },
    domain=DE_EQUITIES,
)
returns_data = run_pipeline(returns_pipe, '2016-01-01', '2016-10-01')

# Convert backward-looking returns into a forward-returns series:
shifted_returns = al.utils.backshift_returns_series(returns_data['1D'], 2)
al_returns = pd.DataFrame(
    data=shifted_returns, 
    index=results.index,
    columns=['1D'],
)
al_returns.index.levels[0].name = "date"
al_returns.index.levels[1].name = "asset"

# Print both returns and shifted returns for asset RAA:
display(returns_data.xs(symbols(1178883552065623), level=1).head(5), 
        al_returns.xs(symbols(1178883552065623), level=1).head(5)
       )

Pipeline Execution Time: 0.21 Seconds
1D
2016-01-04 00:00:00+00:00 0.021655
2016-01-05 00:00:00+00:00 -0.021672
2016-01-06 00:00:00+00:00 0.014971
2016-01-07 00:00:00+00:00 -0.016309
2016-01-08 00:00:00+00:00 -0.015970
1D
date
2016-01-04 00:00:00+00:00 0.014971
2016-01-05 00:00:00+00:00 -0.016309
2016-01-06 00:00:00+00:00 -0.015970
2016-01-07 00:00:00+00:00 -0.021184
2016-01-08 00:00:00+00:00 0.001898
In [14]:
# Format the factor and forward returns data into a format suitable for Alphalens functions:
al_data = al.utils.get_clean_factor(
    results['st_momentum'],
    al_returns,
    quantiles=5,
)
Dropped 1.0% entries from factor data: 1.0% in forward returns computation and 0.0% in binning phase (set max_loss=0 to see potentially suppressed Exceptions).
max_loss is 35.0%, not exceeded: OK!
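
The message above shows that about 1% of entries were dropped, well under the default 35% tolerance. To enforce a stricter tolerance, get_clean_factor accepts a max_loss argument; a minimal sketch (the 5% threshold is only illustrative):

# Raise an error if more than 5% of the factor entries would be dropped:
al_data_strict = al.utils.get_clean_factor(
    results['st_momentum'],
    al_returns,
    quantiles=5,
    max_loss=0.05,
)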
In [15]:
# Create a full alphalens tearsheet:
from alphalens.tears import create_full_tear_sheet
create_full_tear_sheet(al_data)
Quantiles Statistics
min max mean std count count %
factor_quantile
1 -0.535955 0.052093 -0.083567 0.067376 10501 20.156244
2 -0.160060 0.109047 -0.023866 0.045698 10381 19.925909
3 -0.122508 0.136379 -0.001388 0.045561 10379 19.922070
4 -0.100169 0.175337 0.025085 0.044693 10383 19.929748
5 -0.061736 1.137276 0.105359 0.088942 10454 20.066029
Returns Analysis
1D
Ann. alpha -0.054
beta -0.243
Mean Period Wise Return Top Quantile (bps) -1.876
Mean Period Wise Return Bottom Quantile (bps) 3.925
Mean Period Wise Spread (bps) -5.801
Information Analysis
1D
IC Mean -0.012
IC Std. 0.233
Risk-Adjusted IC -0.050
t-stat(IC) -0.685
p-value(IC) 0.494
IC Skew 0.037
IC Kurtosis -0.329
Turnover Analysis
1D
Quantile 1 Mean Turnover 0.180
Quantile 2 Mean Turnover 0.386
Quantile 3 Mean Turnover 0.405
Quantile 4 Mean Turnover 0.374
Quantile 5 Mean Turnover 0.169
1D
Mean Factor Rank Autocorrelation 0.924

3. Turning Pipeline Factors into an Optimized Portfolio

The Optimize API makes it easy to turn the output of a pipeline into an objective and a set of constraints. The order_optimal_portfolio function can then be used in an algorithm to transition the current portfolio to a target portfolio that satisfies those constraints.

Below, a target portfolio is constructed from the st_momentum factor of the pipeline results using a chosen objective and set of constraints.
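
For comparison, here is a minimal sketch of how the same objective and constraints might be used with order_optimal_portfolio inside a backtest algorithm (not runnable in this research notebook; the pipeline name 'my_pipeline' is an assumed placeholder):

import quantopian.algorithm as algo
import quantopian.optimize as opt


def rebalance(context, data):
    # Latest pipeline output; assumes a pipeline named 'my_pipeline' was
    # attached in initialize() via algo.attach_pipeline:
    pipe_results = algo.pipeline_output('my_pipeline').dropna()

    objective = opt.MaximizeAlpha(pipe_results.st_momentum)
    constraints = [
        opt.MaxGrossExposure(1.0),
        opt.DollarNeutral(),
        opt.PositionConcentration.with_equal_bounds(-0.025, 0.025),
    ]

    # Place the orders needed to move the current portfolio to the target:
    algo.order_optimal_portfolio(objective, constraints)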

In [16]:
# Import the Optimize API:
import quantopian.optimize as opt
In [17]:
# Retrieve a particular day's worth of pipeline data on which we'd like to 
# calculate an optimal portfolio:
date = '2016-08-30'
results_date_to_optimize = results.loc[date].dropna()
In [18]:
# Define the MaximizeAlpha Objective function:
objective = opt.MaximizeAlpha(results_date_to_optimize.st_momentum)
In [19]:
# Define threshold values to set constraints:
MAX_GROSS_LEVERAGE = 1.0
MAX_POSITION_WEIGHT = .025

# Set max gross exposure, dollar neutral, position concentration constraints with threshold values:
max_gross = opt.MaxGrossExposure(MAX_GROSS_LEVERAGE)
dollar_neutral = opt.DollarNeutral()
position_concentration = opt.PositionConcentration.with_equal_bounds(
    -MAX_POSITION_WEIGHT, 
    MAX_POSITION_WEIGHT
)

constraints = [max_gross, 
               dollar_neutral, 
               position_concentration]

The objective and list of constraints can be passed to the calculate_optimal_portfolio function to compute a target portfolio: a Series of weights that maximizes the objective without violating any of the constraints.

In [20]:
portfolio_weights = opt.calculate_optimal_portfolio(objective, constraints)
portfolio_weights.head()
Out[20]:
date                       asset                          
2016-08-30 00:00:00+00:00  Equity(1178883552065623 [RAA])     0.0
                           Equity(1178961330058052 [CBK])     0.0
                           Equity(1178969987363650 [FRE])     0.0
                           Equity(1178995706580306 [FNTN])    0.0
                           Equity(1179021207549272 [SMHN])    0.0
dtype: float64
In [21]:
# Constraint report:
print " Gross exposure: {:.2f}".format(portfolio_weights.abs().sum())
print " Net exposure: {:.2f}\n".format(portfolio_weights.sum())
print " Largest long position: {:.3f}".format(portfolio_weights.max())
print " Largest short position: {:.3f}".format(portfolio_weights.min())
print " Number of Positions: {:}\n".format(portfolio_weights[portfolio_weights != 0].count())
 Gross exposure: 1.00
 Net exposure: 0.00

 Largest long position: 0.025
 Largest short position: -0.025
 Number of Positions: 41

In [ ]: