Equity Metadata

Equity Metadata is a dataset that includes information about equities such as the security type, listing currency, exchange, and more. The dataset combines fields from FactSet Symbology and FactSet Fundamentals that are commonly used to build a tradable universe.

Currently, all contest algorithms are required to trade within the QTradableStocksUS universe, but non-US pipelines don't have an equivalent definition. The new Equity Metadata dataset allows you to make a tradable universe in non-US markets so that you can construct factors on a liquid set of equities with similar properties.

Equity Metadata is available via the Pipeline API, which means it can be accessed in both Research and the IDE (in the IDE, for the US_EQUITIES domain only).

Dataset Overview

The EquityMetadata dataset has 5 fields (accessible as BoundColumn attributes):

  • security_type (dtype str): A string code representing the security type of the equity. This table enumerates the set of possible security types.
  • is_primary (dtype bool): Boolean flag denoting whether this is the primary issue of the security (True = primary issue, False = secondary issue).
  • listing_currency (dtype str): Currency in which the listing trades.
  • listing_exchange (dtype str): Code denoting the exchange on which the listing trades.
  • primary_fsym_security_id (dtype str): The primary fsym security ID of the equity. Note that multiple equities can share the same FactSet security ID, because a single security can have multiple listings across different regions.
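
To make the last two fields more concrete, here is a minimal plain-Python sketch that groups listings by their security ID and picks out the ones flagged as primary. The tickers, IDs, and flag values are invented for illustration and do not come from the dataset. The sketch also assumes that one listing per security ID is flagged as primary, which is an assumption and not something stated above.

```python
from collections import defaultdict

# Hypothetical listings: (ticker, primary_fsym_security_id, is_primary).
# These values are made up for illustration only.
listings = [
    ("AAA_US", "XXXXXX-S", True),
    ("AAA_CA", "XXXXXX-S", False),
    ("BBB_US", "YYYYYY-S", True),
]


def group_by_security_id(rows):
    """Map each security ID to the tickers of all listings that share it."""
    groups = defaultdict(list)
    for ticker, security_id, _is_primary in rows:
        groups[security_id].append(ticker)
    return dict(groups)


def primary_listings(rows):
    """Keep only the tickers whose is_primary flag is True."""
    return [ticker for ticker, _security_id, is_primary in rows if is_primary]


groups = group_by_security_id(listings)
primaries = primary_listings(listings)
print(groups)
print(primaries)
```

In a real pipeline the same grouping would be done on the DataFrame returned by run_pipeline, using the primary_fsym_security_id column as the grouping key.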

The following cell constructs and runs a pipeline that gets the latest value for all available fields in the EquityMetadata dataset.

In [1]:
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.factset import EquityMetadata
from quantopian.pipeline.domain import US_EQUITIES
from quantopian.research import run_pipeline

pipe = Pipeline(
    columns={
        'is_primary': EquityMetadata.is_primary.latest,
        'listing_currency': EquityMetadata.listing_currency.latest,
        'listing_exchange': EquityMetadata.listing_exchange.latest,
        'security_type': EquityMetadata.security_type.latest,
        'primary_fsym_security_id': EquityMetadata.primary_fsym_security_id.latest,
    },
    domain=US_EQUITIES,
)

# Reminder: there is a trailing 1-year holdout on this dataset.
df = run_pipeline(pipe, '2016-05-01', '2017-05-05')
df.head()
Out[1]:
                                                 is_primary  listing_currency  listing_exchange  primary_fsym_security_id  security_type
2016-05-02 00:00:00+00:00  Equity(2 [ARNC])      True        USD               NYS               PP17YF-S                  SHARE
                           Equity(21 [AAME])     True        USD               NAS               Q2PB1Q-S                  SHARE
                           Equity(24 [AAPL])     True        USD               NAS               R85KLC-S                  SHARE
                           Equity(25 [ARNC_PR])  False       None              None              None                      None
                           Equity(31 [ABAX])     True        USD               NAS               T7H04W-S                  SHARE

Tradable Universe Example

The following cell constructs and runs a pipeline of Canadian equities that gets screened down to a tradable universe. The tradable universe is defined to be the top 50% of equities (ranked by market cap) of security type SHARE that are also the primary share of the company.

In [2]:
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.factset import EquityMetadata, Fundamentals
from quantopian.pipeline.domain import CA_EQUITIES
from quantopian.research import run_pipeline

# Create a latest market cap factor.
mcap = Fundamentals.mkt_val.latest

# Create a pipeline filter for 'tradable' stocks.
is_tradable = (
    EquityMetadata.security_type.latest.eq('SHARE') 
    & EquityMetadata.is_primary.latest
)

# Create a base universe filter that selects the top 50% of our 'tradable'  
# equities based on market cap.
base_universe = mcap.percentile_between(50, 100, mask=is_tradable)

# Build a pipeline over the Canadian equities domain and screen down to
# a set of stocks that pass our base_universe filter.
pipe = Pipeline(
    domain=CA_EQUITIES,
    screen=base_universe,
)

df = run_pipeline(pipe, '2015-05-05', '2017-05-05')
In [3]:
df.head()
Out[3]:
2015-05-05 00:00:00+00:00 Equity(1178883868150594 [KAR])
Equity(1178892628414550 [CET])
Equity(1178896755081796 [TMB])
Equity(1178900628985687 [MIO.H])
Equity(1178900948861510 [CGLD])
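
To illustrate what the base_universe filter is doing, here is a rough plain-Python approximation of `percentile_between(50, 100, mask=is_tradable)` for a single day. It assumes the 50th percentile is the median of the masked market caps and that the lower bound is inclusive; the actual Pipeline filter may treat ties or missing values differently. The tickers, market caps, and the 'PREF' type code are hypothetical placeholders.

```python
from statistics import median

# Hypothetical rows for one day: (ticker, security_type, is_primary, market_cap).
# All values are invented; 'PREF' stands in for any non-SHARE security type.
rows = [
    ("AAA", "SHARE", True, 100.0),
    ("BBB", "SHARE", True, 50.0),
    ("CCC", "SHARE", True, 10.0),
    ("DDD", "SHARE", False, 500.0),  # not the primary issue, masked out
    ("EEE", "PREF", True, 300.0),    # not of type SHARE, masked out
    ("FFF", "SHARE", True, 20.0),
]


def tradable_top_half(rows):
    """Return tickers in the top half by market cap among tradable rows."""
    # Apply the mask first: only primary issues of type SHARE are ranked.
    tradable = [r for r in rows if r[1] == "SHARE" and r[2]]
    if not tradable:
        return []
    # The cutoff is computed from the masked rows only.
    cutoff = median(r[3] for r in tradable)
    return [r[0] for r in tradable if r[3] >= cutoff]


universe = tradable_top_half(rows)
print(universe)
```

The point of the sketch is the order of operations: the mask is applied before the percentile cutoff is computed, so large but non-tradable equities (DDD and EEE here) do not shift the cutoff.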

Now that we have a base universe, we can add in columns like fundamentals and returns data.

In [4]:
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.factset import EquityMetadata, Fundamentals
from quantopian.pipeline.factors import Returns
from quantopian.pipeline.domain import CA_EQUITIES
from quantopian.research import run_pipeline

# Create a latest market cap factor.
mcap = Fundamentals.mkt_val.latest

# Create a pipeline filter for 'tradable' stocks.
is_tradable = (
    EquityMetadata.security_type.latest.eq('SHARE') 
    & EquityMetadata.is_primary.latest
)

# Create a base universe filter that selects the top 50% of our 'tradable'  
# equities based on market cap.
base_universe = mcap.percentile_between(50, 100, mask=is_tradable)

# Create a latest quarterly sales factor and a trailing 1-week returns factor.
# A window of 6 daily values spans 5 trading days, i.e. one trading week.
latest_sales = Fundamentals.sales_qf.latest
returns_1w = Returns(window_length=6)

# Build a pipeline with our returns and quarterly sales factors. The pipeline is defined  
# over the Canadian equities domain and gets screened down to our base universe.  
pipe = Pipeline(
    columns={
        'quarterly_sales': latest_sales,
        'returns': returns_1w,
        'market_cap': mcap,
    },
    domain=CA_EQUITIES,
    screen=base_universe,
)

# Execute our pipeline.
df = run_pipeline(pipe, '2015-05-05', '2016-05-05')
In [5]:
df.head()
Out[5]:
                                                             market_cap   quarterly_sales  returns
2015-05-05 00:00:00+00:00  Equity(1178883868150594 [KAR])    15691300.0   0.000000e+00     0.018657
                           Equity(1178892628414550 [CET])    99086400.0   7.324200e+07     -0.012766
                           Equity(1178896755081796 [TMB])    300000000.0  3.480000e+08     -0.089286
                           Equity(1178900628985687 [MIO.H])  9815430.0    0.000000e+00     0.000000
                           Equity(1178900948861510 [CGLD])   33968900.0   1.162110e+07     -0.043478
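
The returns column above comes from `Returns(window_length=6)`. As a minimal sketch of why a window of 6 corresponds to one week, the function below assumes the factor measures the percent change from the first to the last value in the window, so 6 daily closes cover 5 trading-day intervals. The price series is made up, and `trailing_return` is a hypothetical helper, not part of the Pipeline API.

```python
def trailing_return(closes, window_length):
    """Percent change between the first and last close of the trailing window."""
    if window_length < 2:
        raise ValueError("window_length must be at least 2")
    if len(closes) < window_length:
        raise ValueError("not enough closes for the requested window")
    window = closes[-window_length:]
    return (window[-1] - window[0]) / window[0]


# Hypothetical daily closes, oldest first.
closes = [98.0, 100.0, 101.0, 102.0, 103.0, 104.0, 105.0]

# Six closes span five daily intervals: 100.0 -> 105.0.
weekly_return = trailing_return(closes, 6)
print(weekly_return)
```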

What labels are included with each field?

The best way to see the set of possible labels for a field is to include it in a pipeline, output the result, and look at the unique values. Below is an example of how to get the list of all currencies included in the GB_EQUITIES (Great Britain) domain.

In [6]:
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.factset import EquityMetadata
from quantopian.pipeline.domain import GB_EQUITIES
from quantopian.research import run_pipeline

pipe = Pipeline(
    columns={
        'listing_currency': EquityMetadata.listing_currency.latest,
    },
    domain=GB_EQUITIES,
)

df_currencies = run_pipeline(pipe, '2004-01-02', '2018-01-10')
In [7]:
print('Listing Currencies in GB_EQUITIES domain:')
for currency in df_currencies.listing_currency.dropna().unique():
    print(currency)
Listing Currencies in GB_EQUITIES domain:
GBP
EUR
JPY
USD
CHF
DKK
SEK
CAD
NOK
PLN
HKD
AUD
RON
CZK
HUF
ISK
SGD
CNY
BGN

GB_EQUITIES has the most diverse set of currencies of all supported markets on Quantopian. Let's see what the distribution of all these currency labels looks like on 2018-01-05:

In [8]:
currency_counts = df_currencies.loc[('2018-01-05', slice(None)), :].groupby('listing_currency').size()
currency_counts.sort_values(ascending=False)
Out[8]:
listing_currency
GBP    3617
EUR    2524
USD    1186
SEK     368
CHF     336
PLN     331
NOK     192
BGN     134
DKK     128
RON      62
HUF      33
CAD      25
JPY      16
CZK      14
ISK      12
AUD       2
CNY       1
NaN       1
SGD       1
SKK       0
AED       0
MYR       0
DEM       0
LTL       0
HKD       0
ZAR       0
BDT       0
HRK       0
NZD       0
LVL       0
CLP       0
CNH       0
ILS       0
RUB       0
COP       0
dtype: int64
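
The zero-count rows in the output above presumably appear because listing_currency is returned as a categorical column, so the grouping reports every known category, including currencies with no listings on that date. This is an inference from the output, not something the notebook states. The plain-Python sketch below mimics that behavior with invented values and shows how the empty labels can be dropped.

```python
from collections import Counter


def counts_with_categories(values, categories):
    """Count each category, reporting 0 for categories that never occur."""
    observed = Counter(v for v in values if v is not None)
    return {category: observed.get(category, 0) for category in categories}


def drop_empty(counts):
    """Keep only the labels that occur at least once."""
    return {label: n for label, n in counts.items() if n > 0}


# Hypothetical category list and observations for one day.
categories = ["GBP", "EUR", "USD", "HKD"]
values = ["GBP", "GBP", "EUR", None, "USD", "GBP"]

all_counts = counts_with_categories(values, categories)
nonzero_counts = drop_empty(all_counts)
print(all_counts)
print(nonzero_counts)
```

On the Series from the cell above, the analogous filter would be `currency_counts[currency_counts > 0]`.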

Usable in the Contest

The Equity Metadata dataset is best used for constructing a tradable universe in international markets, but it can be used in other situations as well. Try exploring the data and see if you can come up with any ideas to include in existing strategies or new strategies altogether!

Equity Metadata is available in pipelines in Research and in the IDE/Backtester (in the IDE/Backtester, for the US_EQUITIES domain only). Algorithms that use Equity Metadata are therefore eligible for the contest and will be considered for an allocation.