Morningstar provides four classifiers that may be helpful: economy sphere, sector, industry group, and industry.
Each one returns integer codes. The corresponding names can be found in the Morningstar documentation. See Appendix: Classification Values https://www.quantopian.com/help/fundamentals#appendix
Since this data resides in Fundamentals, it is retrieved using pipeline.
# First import some things we will need to run pipeline and get our data
from quantopian.research import run_pipeline
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import Fundamentals
from quantopian.pipeline.classifiers.morningstar import Sector
from quantopian.pipeline.filters import Q1500US
Define our pipeline with these four classifiers
# Define pipeline classifiers with the 'latest' method
economy_sphere = Fundamentals.morningstar_economy_sphere_code.latest
sector = Fundamentals.morningstar_sector_code.latest
industry_group = Fundamentals.morningstar_industry_group_code.latest
industry = Fundamentals.morningstar_industry_code.latest
# Sector is used so often that it is also available as a built-in classifier (imported above).
sector_built_in = Sector()
Create the Pipeline
pipe = Pipeline(
    columns={
        'economy_sphere': economy_sphere,
        'sector': sector,
        'sector_built_in': sector_built_in,
        'industry_group': industry_group,
        'industry': industry,
    },
)
Run the Pipeline
results = run_pipeline(pipe, '2018-03-20', '2018-03-20')
Show the results
results
First, notice that sector and sector_built_in are the same, as expected. The dataframe is indexed by both datetime and equity object. Since we only requested a single day, we can drop the datetime level (level 0) to make lookups easier. One way to do this is the '.xs' method.
classification_df = results.xs('2018-03-20')
classification_df
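The same two observations can be checked on a small stand-in frame with plain pandas. This is only a sketch: the asset labels and code values below are placeholders invented for illustration, not real pipeline output, and the `demo_` names are hypothetical.

```python
import pandas as pd

# Placeholder stand-in for pipeline output: a (date, asset) MultiIndex.
# Asset labels and code values are made up for illustration only.
day = pd.Timestamp('2018-03-20')
demo_index = pd.MultiIndex.from_product([[day], ['ASSET_A', 'ASSET_B']])
demo_results = pd.DataFrame(
    {'sector': [311, 206], 'sector_built_in': [311, 206]},
    index=demo_index,
)

# Confirm the two sector columns agree on every row.
sectors_match = (demo_results['sector'] == demo_results['sector_built_in']).all()

# Take the cross-section for the single date; level 0 is dropped,
# leaving a frame indexed by asset only.
demo_classification_df = demo_results.xs(day, level=0)
```

After the cross-section, `demo_classification_df` has a single-level index, so a row can be looked up by asset alone.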
# Now, looking up a particular value is simple.
# There are many ways to do this. For example, to get the industry for a single equity:
aapl = symbols('AAPL')
aapl_industry = classification_df.industry[aapl]
aapl_industry
# The '.at' accessor is a bit faster for looking up a single value
aapl_industry = classification_df.at[aapl, 'industry']
aapl_industry
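A lookup like this returns a bare integer code, so it can help to translate codes into readable names. The sketch below does this for sector codes with a plain dictionary and pandas `map`. The dictionary is deliberately partial and should be checked against the Classification Values appendix. The asset labels are placeholders, and `-1` is assumed here to stand for a missing classification.

```python
import pandas as pd

# Partial, illustrative table of sector codes; verify against the
# Morningstar Classification Values appendix before relying on it.
SECTOR_NAMES = {
    101: 'Basic Materials',
    103: 'Financial Services',
    206: 'Healthcare',
    311: 'Technology',
}

# Placeholder data: asset labels are hypothetical, and -1 is assumed
# to mark an asset with no sector classification.
demo_sectors = pd.Series(
    [311, 206, -1],
    index=['ASSET_A', 'ASSET_B', 'ASSET_C'],
    name='sector',
)

# Codes absent from the table map to NaN, which is then labelled 'Unknown'.
demo_sector_names = demo_sectors.map(SECTOR_NAMES).fillna('Unknown')
```

The same pattern would apply to the other three classifiers, given a code table built from the appendix.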
Create a filter to get only specified industry groups
# Make a list of the industry groups we want (or maybe exclude as the case may be)
my_industries = [31054, 20639]
# Use the 'element_of' method to create a filter from this list
my_industry_filter = industry_group.element_of(my_industries)
# Now run our pipeline with this filter as a screen
filtered_pipe = Pipeline(
    columns={
        'economy_sphere': economy_sphere,
        'sector': sector,
        'sector_built_in': sector_built_in,
        'industry_group': industry_group,
        'industry': industry,
    },
    screen=my_industry_filter,
)
results = run_pipeline(filtered_pipe, '2018-03-20', '2018-03-20')
results
Notice that only equities in our two industry groups, 31054 and 20639, are returned.
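The membership logic behind this screen can be illustrated outside pipeline with pandas `isin`, including the exclusion case mentioned earlier. This is a sketch on placeholder data: the asset labels and the code 10101 are invented for illustration. In pipeline itself, negating the filter with `~` should give the corresponding exclusion screen.

```python
import pandas as pd

my_industries = [31054, 20639]

# Placeholder industry-group codes per asset; labels and the
# code 10101 are hypothetical.
demo_groups = pd.Series(
    [31054, 10101, 20639, 10101],
    index=['ASSET_A', 'ASSET_B', 'ASSET_C', 'ASSET_D'],
    name='industry_group',
)

# Boolean mask: True where the asset's group is in the wanted list.
keep_mask = demo_groups.isin(my_industries)

kept = demo_groups[keep_mask]        # include only the listed groups
excluded = demo_groups[~keep_mask]   # or drop the listed groups instead
```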