by Gil Wassermann
The goal of this project is to create a universe of the most tradeable securities with a view to optimizing pipeline performance and reducing noisy data caused by untradeable assets. If a robust, tradeable universe can be established, users will be able to create better, more reliable algorithms.
A first pass of this process is completed in a series of steps:
After this initial universe is created, securities are removed only if they fail the tradeability filter. When a stock is removed, the proposed replacement is the most liquid stock outside the universe that passes the tradeability filter. The proposed stock is then checked against the sector exposure limit: if adding it would not exceed the limit, it joins the universe; otherwise the next most liquid stock is proposed.
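The replacement procedure described above can be sketched in plain pandas on toy data. This is a minimal illustration rather than the notebook's implementation: the function name `refill_universe`, the ticker-indexed frames with `Sector` and `Liquidity` columns, and the choice to treat the sector cap as inclusive are all assumptions made for the example.

```python
import math
import pandas as pd


def refill_universe(survivors, candidates, tradeable_count, sector_exposure_limit):
    """Top up `survivors` from `candidates` without breaching the per-sector cap.

    Both frames are indexed by ticker and carry 'Sector' and 'Liquidity'
    columns (a hypothetical layout chosen for this sketch).
    """
    # maximum number of names allowed in any one sector (assumed inclusive)
    cap = int(math.ceil(sector_exposure_limit * tradeable_count))
    counts = survivors['Sector'].value_counts().to_dict()
    members = list(survivors.index)

    # propose candidates from most to least liquid
    ranked = candidates.sort_values('Liquidity', ascending=False)
    for ticker, sector in zip(ranked.index, ranked['Sector']):
        if len(members) >= tradeable_count:
            break
        # accept only if the sector stays within its cap
        if counts.get(sector, 0) + 1 <= cap:
            members.append(ticker)
            counts[sector] = counts.get(sector, 0) + 1
    return members


# toy data: three surviving names and three replacement candidates
survivors = pd.DataFrame(
    {'Sector': [101, 101, 206], 'Liquidity': [9.0, 8.0, 7.0]},
    index=['A', 'B', 'C'])
candidates = pd.DataFrame(
    {'Sector': [101, 206, 311], 'Liquidity': [6.0, 5.0, 4.0]},
    index=['D', 'E', 'F'])

# cap = ceil(0.4 * 5) = 2 names per sector
result = refill_universe(survivors, candidates, tradeable_count=5,
                         sector_exposure_limit=0.4)
```

In this toy case the most liquid candidate, `D`, is skipped because its sector is already at the cap, so the next two candidates are taken instead.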
The create_tradeable method lets you customize both the number of securities in the universe and the sector exposure threshold. The former allows you to create a Tradeable500US, Tradeable1500US, etc., while the latter sets a target percentage that limits the influence of particular industry groups in the alpha generation process. Included in this notebook are some graphics to observe sector exposures.
The filters used are:
To remain sector neutral, we create a filter that retrieves no more than the maximum number of equities per sector (given by the sector_exposure_limit), and then we take the tradeable_count most liquid assets in the past month from this list.
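The two-stage selection described above (cap each sector, then keep the most liquid names overall) can be illustrated outside of Pipeline with a small pandas sketch. The function name `capped_top_liquid` and the toy frame are hypothetical; only the `tradeable_count` and `sector_exposure_limit` parameter names come from the notebook.

```python
import math
import pandas as pd


def capped_top_liquid(frame, tradeable_count, sector_exposure_limit):
    """Cap each sector, then keep the overall most liquid names."""
    # per-sector maximum implied by the exposure limit
    cap = int(math.ceil(sector_exposure_limit * tradeable_count))
    ranked = frame.sort_values('Liquidity', ascending=False)
    # keep at most `cap` of the most liquid names within each sector;
    # groupby().head() preserves the liquidity ordering of `ranked`
    per_sector = ranked.groupby('Sector').head(cap)
    # then keep the `tradeable_count` most liquid names overall
    return per_sector.head(tradeable_count)


# toy data: six tickers across three sector codes
toy = pd.DataFrame(
    {'Sector': [311, 311, 311, 206, 101, 206],
     'Liquidity': [10.0, 9.0, 8.0, 7.0, 6.0, 5.0]},
    index=['A', 'B', 'C', 'D', 'E', 'F'])

# cap = ceil(0.5 * 4) = 2 names per sector
universe = capped_top_liquid(toy, tradeable_count=4, sector_exposure_limit=0.5)
```

Here `C` is the third most liquid name overall but is dropped because its sector already contributes two names.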
More information about the filter process can be found here:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import math
from datetime import timedelta, date
from quantopian.research import run_pipeline
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.factors import AverageDollarVolume, CustomFactor, Latest
from quantopian.pipeline.filters.morningstar import IsPrimaryShare
from quantopian.pipeline.data import morningstar as mstar
from quantopian.pipeline.classifiers.morningstar import Sector
# Constants that need to be global
COMMON_STOCK = 'ST00000001'

SECTOR_NAMES = {
    101: 'Basic Materials',
    102: 'Consumer Cyclical',
    103: 'Financial Services',
    104: 'Real Estate',
    205: 'Consumer Defensive',
    206: 'Healthcare',
    207: 'Utilities',
    308: 'Communication Services',
    309: 'Energy',
    310: 'Industrials',
    311: 'Technology',
}
# Average Dollar Volume without nanmean, so that recent IPOs are truly removed
class ADV_adj(CustomFactor):
    inputs = [USEquityPricing.close, USEquityPricing.volume]
    window_length = 252

    def compute(self, today, assets, out, close, volume):
        close[np.isnan(close)] = 0
        out[:] = np.mean(close * volume, 0)
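To see why the comment above insists on avoiding nanmean, the following NumPy sketch compares the two averages for a stock with a short trading history. The 60-day listing and the constant daily dollar volume of 1,000,000 are made-up numbers for illustration only.

```python
import numpy as np

window = 252        # same window length as ADV_adj
days_listed = 60    # hypothetical recent IPO with 60 days of history

# daily dollar volume: missing (NaN) before listing, constant afterwards
dollar_volume = np.full(window, np.nan)
dollar_volume[-days_listed:] = 1000000.0

# skipping missing days averages only over the days the stock traded
skip_missing = np.nanmean(dollar_volume)

# counting missing days as zero spreads the volume over the full window
zero_filled = np.mean(np.nan_to_num(dollar_volume))
```

With missing days skipped, the stock looks as liquid as a name that traded 1,000,000 every day of the year. With missing days counted as zero, its average falls to 60/252 of that figure, which is below the 250000 threshold used by the liquid filter in universe_filters.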
def universe_filters():
    """
    Create a Filter implementing common acceptance criteria.

    Returns
    -------
    zipline.Filter
        Filter to control tradeability
    """
    # Equities with an average daily dollar volume greater than $750,000 over the past year.
    high_volume = (AverageDollarVolume(window_length=252) > 750000)

    # Not Misc. sector:
    sector_check = Sector() != -1.

    # Equities that morningstar lists as primary shares.
    # NOTE: This will return False for stocks not in the morningstar database.
    primary_share = IsPrimaryShare()

    # Equities for which morningstar's most recent Market Cap value is above $300m.
    have_market_cap = mstar.valuation.market_cap.latest > 300000000

    # Equities not listed as depositary receipts by morningstar.
    # Note the inversion operator, `~`, at the start of the expression.
    not_depositary = ~mstar.share_class_reference.is_depositary_receipt.latest

    # Equities that are listed as common stock (as opposed to, say, preferred stock).
    # This is our first string column. The .eq method used here produces a Filter returning
    # True for all asset/date pairs where security_type produced a value of 'ST00000001'.
    common_stock = mstar.share_class_reference.security_type.latest.eq(COMMON_STOCK)

    # Equities whose exchange id does not start with OTC (Over The Counter).
    # startswith() is a new method available only on string-dtype Classifiers.
    # It returns a Filter.
    not_otc = ~mstar.share_class_reference.exchange_id.latest.startswith('OTC')

    # Equities whose symbol (according to morningstar) does not end with .WI
    # A .WI suffix generally indicates a "When Issued" offering.
    # endswith() works similarly to startswith().
    not_wi = ~mstar.share_class_reference.symbol.latest.endswith('.WI')

    # Equities whose company name does not end with 'LP' or a similar string.
    # The .matches() method uses the standard library `re` module to match
    # against a regular expression.
    not_lp_name = ~mstar.company_reference.standard_name.latest.matches('.* L[\\. ]?P\.?$')

    # Equities with a null entry for the balance_sheet.limited_partnership field.
    # This is an alternative way of checking for LPs.
    not_lp_balance_sheet = mstar.balance_sheet.limited_partnership.latest.isnull()

    # Highly liquid assets only. Also eliminates IPOs in the past 12 months
    # Use new average dollar volume so that unrecorded days are given value 0
    # and not skipped over
    # S&P Criterion
    liquid = ADV_adj() > 250000

    # Add logic when global markets supported
    # S&P Criterion
    domicile = True

    # sector_check is included so that equities without a sector code are excluded
    universe_filter = (high_volume & sector_check & primary_share & have_market_cap &
                       not_depositary & common_stock & not_otc & not_wi & not_lp_name &
                       not_lp_balance_sheet & liquid & domicile)

    return universe_filter
def sector_filters(tradeable_count, sector_exposure_limit):
    """
    Mask for Pipeline in create_tradeable. Limits each sector so as not to be over-exposed

    Parameters
    ----------
    tradeable_count : int
        Target number of constituent securities in universe
    sector_exposure_limit : float
        Target threshold for any particular sector

    Returns
    -------
    zipline.Filter
        Filter to control sector exposure
    """
    # set thresholds
    if sector_exposure_limit < (1. / len(SECTOR_NAMES)):
        threshold = int(math.ceil((1. / len(SECTOR_NAMES)) * tradeable_count))
    elif sector_exposure_limit > 1.:
        threshold = tradeable_count
    else:
        threshold = int(math.ceil(sector_exposure_limit * tradeable_count))

    # retrieve sector codes
    sector = Sector()

    # for each sector, create a filter keeping at most `threshold` of its most liquid equities
    sector_trims = [
        AverageDollarVolume(window_length=21).top(threshold, mask=sector.eq(code))
        for code in sorted(SECTOR_NAMES)
    ]

    # combine the per-sector filters into a single filter
    combined = sector_trims[0]
    for trim in sector_trims[1:]:
        combined = combined | trim

    return combined
# Method to create a tradeable universe of a certain size on a certain date
def create_tradeable(tradeable_count=500, sector_exposure_limit=0.15, date='2015-01-01'):
    """
    Computes a given number of the most tradeable stocks and presents them as a tradeable universe.

    Parameters
    ----------
    tradeable_count : int
        Target number of constituent securities in universe
    sector_exposure_limit : float
        Target threshold for any particular sector
    date : string
        YYYY-MM-DD for date on which to run the universe

    Returns
    -------
    tradeable_secs : pd.Series
        Equity objects of securities to be included in the TradeableUS universe.
    """
    # create Pipeline
    tradeable_pipe = Pipeline()

    # add the monthly average dollar volume, used to rank securities by liquidity
    tradeable_pipe.add(AverageDollarVolume(window_length=21), 'Liquidity')

    # add filters to the pipe to weed out untradeable stocks
    tradeable_filter = universe_filters()
    sector_filter = sector_filters(tradeable_count, sector_exposure_limit)
    tradeable_pipe.set_screen(tradeable_filter & sector_filter)
    tradeable_pipe_results = run_pipeline(tradeable_pipe, date, date)

    # rank by liquidity; sort returns a new frame, so the result must be kept
    tradeable_pipe_results = tradeable_pipe_results.sort('Liquidity', ascending=False)
    tradeable_secs = pd.Series(tradeable_pipe_results.index.get_level_values(1).values)

    # if the desired number of securities is larger than the number of filtered securities,
    # head() returns all of them, as this is the maximum number of tradeable equities
    # in the entire stock universe
    return tradeable_secs.head(tradeable_count)
def tradeable_sector_analysis(t_set, date):
    """
    Quick visualization of sector exposures in the universe

    Parameters
    ----------
    t_set : pd.Series
        Index of every constituent of universe
    date : string
        YYYY-MM-DD for date on which to run the analysis of the universe
    """
    # run pipeline with sector and close price
    pipe = Pipeline()
    pipe.add(Latest(inputs=[USEquityPricing.close]), 'Close')
    pipe.add(Sector(), 'Sector')
    results = run_pipeline(pipe, date, date)

    # get the results only for those in the tradeable universe
    results.index = results.index.levels[1]
    results = results.loc[t_set.as_matrix(), :]

    # group data
    sector_groups = results.groupby(by='Sector')
    sector_counts = sector_groups.count()
    xticks = [SECTOR_NAMES.get(i) for i in sector_counts.index]

    # create bar chart of number of companies in each sector
    ax_freq = sector_counts.plot(kind='bar', color='c')
    ax_freq.set_xticklabels(xticks, rotation=45)
    ax_freq.set_ylabel('Frequency')
    ax_freq.set_title('Sector Frequencies')
    ax_freq.legend().set_visible(False)

    # create pie chart of the proportion of companies in each sector
    ax_prop = sector_counts.plot(kind='pie', subplots=True, labels=xticks, colormap='Blues')
    ax_prop[0].set_ylabel('');
def update_universe(tradeable_0, tradeable_count, sector_exposure_limit, date, timedelta_days):
    """
    Takes in one universe and returns another timedelta_days later

    Parameters
    ----------
    tradeable_0 : pd.Series
        Equity objects of securities to be included in the TradeableUS universe
    tradeable_count : int
        Desired number of securities in universe
    sector_exposure_limit : float
        Target threshold for any particular sector
    date : datetime
        datetime object of date that tradeable_0 was run
    timedelta_days : int
        interval in days until next update of universe

    Returns
    -------
    turnover : float
        For analysis purposes. Calculates what fraction of the universe
        has changed between time periods
    tradeable_1_index : pd.Series
        Index of securities to be included in the TradeableUS universe for the next time period
    """
    # Run pipeline for the next period
    next_date = date + timedelta(days=timedelta_days)
    full_pipe = Pipeline()
    full_pipe.add(AverageDollarVolume(window_length=21), 'Liquidity')
    full_pipe.add(Sector(), 'Sector')
    tradeable_filters = universe_filters()
    full_pipe.set_screen(tradeable_filters)
    full_results = run_pipeline(full_pipe, next_date, next_date)

    # remove time component of multiindex
    full_results.index = full_results.index.levels[1]

    # get results in tradeable_0 in the next period
    tradeable_0_results = full_results.loc[tradeable_0.tolist(), :]

    # remove nan values, which show up if tradeable_0 securities have fallen out of index
    tradeable_0_results = tradeable_0_results.dropna()

    # group by sector for sector neutrality threshold
    tradeable_0_sector_counts = tradeable_0_results.groupby('Sector').count()

    # get threshold
    threshold = int(math.ceil(tradeable_count * sector_exposure_limit))

    # list of securities to add ranked by liquidity
    add_list = full_results.drop(tradeable_0_results.index.get_values().tolist())
    add_list = add_list.sort('Liquidity', ascending=False)

    # number of securities to add
    to_add = tradeable_count - len(tradeable_0_results.index)
    turnover = float(to_add) / float(tradeable_count)

    # create variable for index values as list
    tradeable_1_index = tradeable_0_results.index.get_values().tolist()

    # loop through proposed additions, most liquid first
    for i in range(len(add_list.index)):
        # if no more securities to add
        if to_add == 0:
            break

        candidate = add_list.iloc[i]
        sector = int(candidate['Sector'])

        # if addition would not break sector exposure limit
        if (tradeable_0_sector_counts.loc[sector, 'Liquidity'] + 1) < threshold:
            tradeable_1_index.append(candidate.name)
            # update the count in place; chained indexing would modify a copy
            tradeable_0_sector_counts.loc[sector, 'Liquidity'] += 1
            to_add -= 1
        # if addition would break sector exposure limit, move on to the next candidate

    # also reached when the candidate list is exhausted before the universe is full
    return turnover, pd.Series(tradeable_1_index)
Let us look at the Tradeable500US and get a quick overview of its constituents. Then we will have a look at its turnover (the number of new equities in an update over the total number of equities in the universe).
tradeable_0 = create_tradeable(500, 0.2, '2015-01-01')
tradeable_sector_analysis(tradeable_0, '2015-01-01')
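The turnover measure defined above (new equities in an update divided by the total number of equities in the universe) can be written as a small standalone helper. The function name `universe_turnover` and the ticker lists are hypothetical and serve only to make the definition concrete.

```python
def universe_turnover(previous, current):
    """Fraction of `current` members that were not in `previous`."""
    previous, current = set(previous), set(current)
    if not current:
        return 0.0
    # new names in the updated universe over the size of the updated universe
    return len(current - previous) / float(len(current))


# toy example: one of four names is replaced between updates
before = ['A', 'B', 'C', 'D']
after = ['A', 'B', 'C', 'E']
example_turnover = universe_turnover(before, after)
```

In this toy example one name out of four is new, so the turnover is 0.25.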
# create tradeable universe
tradeable500US = create_tradeable(500, 0.2, date(2003,1,1))
turnovers500US = []
# iterate over months
for month in (date(2003, 1, 1) + timedelta(days=30*n) for n in range(155)):
    turnover, tradeable_next = update_universe(tradeable500US, 500, 0.2, month, 30)
    tradeable500US = tradeable_next
    turnovers500US.append(turnover)
# plot results
months = range(len(turnovers500US))
plt.plot(months, turnovers500US)
plt.axhline(np.mean(turnovers500US), color='r')
plt.title('Monthly Turnover Tradeable500US')
plt.xlabel('Months Elapsed')
plt.ylabel('Turnover');
As we can see above, our universe is not overweight in any particular sector and the average turnover is less than 0.3%, which corresponds to between one and two securities per month. It should also be noted that the spike occurs around 70 months after January 2003, which corresponds to late 2008 (the collapse of Lehman Brothers). Even in this unstable macroeconomic state, the universe sees only 1.4% turnover (7 securities).
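The conversions between turnover percentages and security counts quoted above follow directly from the universe size of 500. As a quick check, with hypothetical helper names:

```python
def securities_replaced(turnover_fraction, universe_size):
    """Number of securities implied by a turnover fraction."""
    return turnover_fraction * universe_size


def turnover_from_count(securities, universe_size):
    """Turnover fraction implied by a number of replaced securities."""
    return securities / float(universe_size)


# 0.3% of a 500-name universe is the upper bound on the average monthly churn
average_upper_bound = securities_replaced(0.003, 500)

# 7 replacements in a 500-name universe, as in the late-2008 spike
crisis_turnover = turnover_from_count(7, 500)
```

The first value works out to about 1.5 securities per month and the second to 0.014, that is, 1.4%.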
tradeable_0 = create_tradeable(1500, 0.2, '2015-01-01')
tradeable_sector_analysis(tradeable_0, '2015-01-01')
# create tradeable universe
tradeable1500US = create_tradeable(1500, 0.2, date(2003,1,1))
turnovers1500US = []
# iterate over months
for month in (date(2003, 1, 1) + timedelta(days=30*n) for n in range(155)):
    turnover, tradeable_next = update_universe(tradeable1500US, 1500, 0.2, month, 30)
    tradeable1500US = tradeable_next
    turnovers1500US.append(turnover)
# plot results
months = range(len(turnovers1500US))
plt.plot(months, turnovers1500US)
plt.axhline(np.mean(turnovers1500US), color='r')
plt.title('Monthly Turnover Tradeable1500US')
plt.xlabel('Months Elapsed')
plt.ylabel('Turnover');
Once again we see a spike in turnover during the financial crisis and an average turnover of well below 1%.
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory or other services by Quantopian.
In addition, the content of the website neither constitutes investment advice nor offers any opinion with respect to the suitability of any security or any specific investment. Quantopian makes no guarantees as to accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.