In the code below, I use this to pick the top n_stocks by market cap:
top_market_cap = market_cap.top(n_stocks, mask=has_market_cap & has_exchange_id & not_depositary & common_stock & not_otc & not_wi & not_lp_name & not_lp_balance_sheet)
What if I want to skip the first 50 stocks, for example, and select the next 50? Presently, I am outputting 100 stocks, and then dropping the first 50 outside of pipeline. It would be more elegant (and more efficient?) to simply select what I want from within pipeline. Is this possible?
Also, if you see ways to improve my filtering, please let me know.
Thanks,
Grant
# Imports assumed for this snippet (standard Quantopian module paths).
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import morningstar as mstar

# Morningstar security_type code for common stock (see the common_stock filter below).
COMMON_STOCK = 'ST00000001'

def make_pipeline(n_stocks):

    ### Factors

    # Create a factor for the market cap.
    market_cap = mstar.valuation.market_cap.latest

    # Create a classifier for the exchange ID.
    exchange_id = mstar.company_reference.primary_exchange_id.latest

    ### Filters

    # Create a filter returning True for securities with a non-NaN market cap.
    has_market_cap = market_cap.notnan()

    # Create a filter returning True for securities with a non-null exchange ID.
    has_exchange_id = exchange_id.notnull()
    # has_exchange_id = exchange_id.eq('NAS')

    # Equities not listed as depositary receipts by Morningstar.
    # Note the inversion operator, `~`, at the start of the expression.
    not_depositary = ~mstar.share_class_reference.is_depositary_receipt.latest

    # Equities that are listed as common stock (as opposed to, say, preferred stock).
    # This is our first string column. The .eq method used here produces a Filter returning
    # True for all asset/date pairs where security_type produced a value of 'ST00000001'.
    common_stock = mstar.share_class_reference.security_type.latest.eq(COMMON_STOCK)

    # Equities whose exchange ID does not start with OTC (Over The Counter).
    # startswith() is a new method available only on string-dtype Classifiers.
    # It returns a Filter.
    not_otc = ~mstar.share_class_reference.exchange_id.latest.startswith('OTC')

    # Exclude equities whose symbol (according to Morningstar) ends with .WI.
    # This generally indicates a "When Issued" offering.
    # endswith() works similarly to startswith().
    not_wi = ~mstar.share_class_reference.symbol.latest.endswith('.WI')

    # Exclude equities whose company name ends with 'LP' or a similar string.
    # The .matches() method uses the standard library `re` module to match
    # against a regular expression. A raw string keeps the escapes unambiguous.
    not_lp_name = ~mstar.company_reference.standard_name.latest.matches(r'.* L[\. ]?P\.?$')

    # Equities with a null entry for the balance_sheet.limited_partnership field.
    # This is an alternative way of checking for LPs.
    not_lp_balance_sheet = mstar.balance_sheet.limited_partnership.latest.isnull()

    # Combine all of the filters into a single mask.
    universe_mask = (
        has_market_cap
        & has_exchange_id
        & not_depositary
        & common_stock
        & not_otc
        & not_wi
        & not_lp_name
        & not_lp_balance_sheet
    )

    # Get the top n_stocks securities by market cap among those that pass the mask.
    top_market_cap = market_cap.top(n_stocks, mask=universe_mask)

    # The mask is already applied inside top(), so the screen is just the top-N filter.
    tradeable_filter = top_market_cap

    return Pipeline(
        columns={
            'market_cap': market_cap
        },
        screen=tradeable_filter
    )
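
To make the "skip the first 50, take the next 50" request concrete, here is a minimal plain-Python sketch of the selection being asked for. It is not pipeline code: `select_rank_window`, `caps`, and the ticker names are hypothetical and exist only for illustration. It ranks by market cap in descending order, ignores missing values (mirroring `has_market_cap`), and keeps ranks `skip + 1` through `skip + n_stocks`.

```python
def select_rank_window(market_caps, skip, n_stocks):
    """Return symbols ranked skip+1 .. skip+n_stocks by market cap, largest first.

    Hypothetical helper for illustration only; not part of any pipeline API.
    """
    # Drop missing values (None or NaN), mirroring the has_market_cap filter.
    valid = {
        sym: cap
        for sym, cap in market_caps.items()
        if cap is not None and cap == cap  # NaN is the only value not equal to itself
    }
    # Sort symbols from largest to smallest market cap.
    ranked = sorted(valid, key=valid.get, reverse=True)
    # Skip the first `skip` names and keep the next `n_stocks`.
    return ranked[skip:skip + n_stocks]


# Made-up market caps; 'CCC' has no usable value and is ignored.
caps = {
    'AAA': 500.0,
    'BBB': 400.0,
    'CCC': float('nan'),
    'DDD': 300.0,
    'EEE': 200.0,
    'FFF': 100.0,
}

window = select_rank_window(caps, skip=2, n_stocks=2)
print(window)

# The same window expressed as "top 4 but not top 2".
top_4 = set(select_rank_window(caps, skip=0, n_stocks=4))
top_2 = set(select_rank_window(caps, skip=0, n_stocks=2))
assert top_4 - top_2 == set(window)
```

The final assertion shows that the window equals "top `skip + n_stocks`, excluding top `skip`". Because the filters in `make_pipeline` are already combined with `&` and `~`, the analogous in-pipeline expression would presumably be `market_cap.top(100, mask=...) & ~market_cap.top(50, mask=...)`. That is an untested assumption, not a confirmed answer.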