Notebook

Get Industry Codes Using Pipeline

Morningstar provides four classifiers that may be helpful:

  • morningstar_economy_sphere_code
  • morningstar_sector_code
  • morningstar_industry_group_code
  • morningstar_industry_code

These all return classifiers with integer values. The corresponding names can be found in the Morningstar documentation; see Appendix: Classification Values at https://www.quantopian.com/help/fundamentals#appendix
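Because the classifiers return bare integers, a small lookup table makes the output easier to read. The sketch below is a minimal, hand-written mapping for the economy sphere and sector codes. The names are transcribed from the Morningstar classification and should be checked against the appendix linked above. The helper names sector_name and economy_sphere_name are hypothetical and not part of any Quantopian API.

```python
# Hand-written lookup tables (assumption: verify against the appendix).
ECONOMY_SPHERE_NAMES = {
    1: 'Cyclical',
    2: 'Defensive',
    3: 'Sensitive',
}

SECTOR_NAMES = {
    101: 'Basic Materials',
    102: 'Consumer Cyclical',
    103: 'Financial Services',
    104: 'Real Estate',
    205: 'Consumer Defensive',
    206: 'Healthcare',
    207: 'Utilities',
    308: 'Communication Services',
    309: 'Energy',
    310: 'Industrials',
    311: 'Technology',
}


def sector_name(code):
    """Return a readable sector name, or 'Unknown' for unmapped codes."""
    return SECTOR_NAMES.get(code, 'Unknown')


def economy_sphere_name(code):
    """Return a readable economy sphere name, or 'Unknown' for unmapped codes."""
    return ECONOMY_SPHERE_NAMES.get(code, 'Unknown')
```

Any code not in the tables, such as a missing-value marker, falls through to 'Unknown' rather than raising an error.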

Since these data reside in Fundamentals, they are retrieved using Pipeline.

In [1]:
# First, import what we need to run the pipeline and get our data
from quantopian.research import run_pipeline
from quantopian.pipeline import Pipeline

from quantopian.pipeline.data import Fundamentals

from quantopian.pipeline.classifiers.morningstar import Sector
from quantopian.pipeline.filters import Q1500US

Define our pipeline with these four classifiers

In [2]:
# Define pipeline classifiers with the 'latest' method
economy_sphere = Fundamentals.morningstar_economy_sphere_code.latest
sector = Fundamentals.morningstar_sector_code.latest
industry_group = Fundamentals.morningstar_industry_group_code.latest
industry = Fundamentals.morningstar_industry_code.latest
    
# Sector is used so often that it has a built-in classifier. Remember to import it.
sector_built_in = Sector()
 

Create the Pipeline

In [3]:
pipe = Pipeline(
    columns={
        'economy_sphere': economy_sphere,
        'sector': sector,
        'sector_built_in': sector_built_in,
        'industry_group': industry_group,
        'industry': industry,
    },
)

Run the Pipeline

In [4]:
results = run_pipeline(pipe, '2018-03-20', '2018-03-20')

Show the results

In [5]:
results
Out[5]:
economy_sphere industry industry_group sector sector_built_in
2018-03-20 00:00:00+00:00 Equity(2 [ARNC]) 3 31061119 31061 310 310
Equity(21 [AAME]) 1 10324058 10324 103 103
Equity(24 [AAPL]) 3 31167138 31167 311 311
Equity(25 [ARNC_PR]) 1 10106008 10106 101 101
Equity(31 [ABAX]) 2 20640091 20640 206 206
Equity(41 [ARCB]) 3 31062127 31062 310 310
Equity(52 [ABM]) 3 31054109 31054 310 310
Equity(53 [ABMD]) 2 20639090 20639 206 206
Equity(62 [ABT]) 2 20639090 20639 206 206
Equity(64 [ABX]) 1 10106010 10106 101 101
Equity(66 [AB]) 1 10319042 10319 103 103
Equity(67 [ADSK]) 3 31165133 31165 311 311
Equity(70 [VBF]) 1 10319042 10319 103 103
Equity(76 [TAP]) 2 20529071 20529 205 205
Equity(84 [ACET]) 2 20636086 20636 206 206
Equity(100 [IEP]) 3 31055110 31055 310 310
Equity(106 [ACU]) 2 20531076 20531 205 205
Equity(110 [ACXM]) 3 31165132 31165 311 311
Equity(112 [ACY]) 3 31056111 31056 310 310
Equity(114 [ADBE]) 3 31165133 31165 311 311
Equity(117 [AEY]) 3 31167142 31167 311 311
Equity(122 [ADI]) 3 31169147 31169 311 311
Equity(128 [ADM]) 2 20531075 20531 205 205
Equity(149 [ADX]) 1 10319042 10319 103 103
Equity(153 [AE]) 3 30948103 30948 309 309
Equity(154 [AEM]) 1 10106010 10106 101 101
Equity(157 [AEG]) 1 10323057 10323 103 103
Equity(161 [AEP]) 2 20744096 20744 207 207
Equity(166 [AES]) 2 20744095 20744 207 207
Equity(168 [AET]) 2 20637087 20637 206 206
... ... ... ... ... ...
Equity(51807 [BOON]) -1 -1 -1 -1 -1
Equity(51808 [NEBU]) 3 31055110 31055 310 310
Equity(51809 [DINT]) -1 -1 -1 -1 -1
Equity(51810 [NEBU_W]) -1 -1 -1 -1 -1
Equity(51811 [EAGL]) 3 31055110 31055 310 310
Equity(51812 [EAGL_W]) -1 -1 -1 -1 -1
Equity(51813 [RFL]) 2 20635084 20635 206 206
Equity(51814 [BTAI]) 2 20635084 20635 206 206
Equity(51815 [GLIB_A]) 3 30845100 30845 308 308
Equity(51817 [GLIB_P]) 3 30845100 30845 308 308
Equity(51818 [IDT_WI]) -1 -1 -1 -1 -1
Equity(51820 [BWB]) 1 10320050 10320 103 103
Equity(51821 [CMSA]) -1 -1 -1 -1 -1
Equity(51822 [GPAQ]) -1 -1 -1 -1 -1
Equity(51823 [GPAQ_W]) -1 -1 -1 -1 -1
Equity(51824 [MUDS]) -1 -1 -1 -1 -1
Equity(51825 [MUDS_W]) -1 -1 -1 -1 -1
Equity(51826 [ORGS]) -1 -1 -1 -1 -1
Equity(51827 [RCUS]) 2 20635084 20635 206 206
Equity(51828 [OPES_U]) 3 31055110 31055 310 310
Equity(51830 [TIBR_U]) 3 31055110 31055 310 310
Equity(51831 [AIHS]) 1 10322056 10322 103 103
Equity(51832 [ZS]) 3 31165134 31165 311 311
Equity(51833 [COGT_V]) -1 -1 -1 -1 -1
Equity(51834 [AIZP]) -1 -1 -1 -1 -1
Equity(51835 [RDVT]) -1 -1 -1 -1 -1
Equity(51836 [QTS_PRA]) -1 -1 -1 -1 -1
Equity(51837 [CODI_PRB]) -1 -1 -1 -1 -1
Equity(51841 [FIYY]) -1 -1 -1 -1 -1
Equity(51842 [FFEU]) -1 -1 -1 -1 -1

8573 rows × 5 columns

First, notice that sector and sector_built_in are the same, as expected. This dataframe is indexed by both datetime and equity object. To make things easier, drop the datetime index (level 0), for example with the '.xs' method.

In [6]:
classification_df = results.xs('2018-03-20')
classification_df
Out[6]:
economy_sphere industry industry_group sector sector_built_in
Equity(2 [ARNC]) 3 31061119 31061 310 310
Equity(21 [AAME]) 1 10324058 10324 103 103
Equity(24 [AAPL]) 3 31167138 31167 311 311
Equity(25 [ARNC_PR]) 1 10106008 10106 101 101
Equity(31 [ABAX]) 2 20640091 20640 206 206
Equity(41 [ARCB]) 3 31062127 31062 310 310
Equity(52 [ABM]) 3 31054109 31054 310 310
Equity(53 [ABMD]) 2 20639090 20639 206 206
Equity(62 [ABT]) 2 20639090 20639 206 206
Equity(64 [ABX]) 1 10106010 10106 101 101
Equity(66 [AB]) 1 10319042 10319 103 103
Equity(67 [ADSK]) 3 31165133 31165 311 311
Equity(70 [VBF]) 1 10319042 10319 103 103
Equity(76 [TAP]) 2 20529071 20529 205 205
Equity(84 [ACET]) 2 20636086 20636 206 206
Equity(100 [IEP]) 3 31055110 31055 310 310
Equity(106 [ACU]) 2 20531076 20531 205 205
Equity(110 [ACXM]) 3 31165132 31165 311 311
Equity(112 [ACY]) 3 31056111 31056 310 310
Equity(114 [ADBE]) 3 31165133 31165 311 311
Equity(117 [AEY]) 3 31167142 31167 311 311
Equity(122 [ADI]) 3 31169147 31169 311 311
Equity(128 [ADM]) 2 20531075 20531 205 205
Equity(149 [ADX]) 1 10319042 10319 103 103
Equity(153 [AE]) 3 30948103 30948 309 309
Equity(154 [AEM]) 1 10106010 10106 101 101
Equity(157 [AEG]) 1 10323057 10323 103 103
Equity(161 [AEP]) 2 20744096 20744 207 207
Equity(166 [AES]) 2 20744095 20744 207 207
Equity(168 [AET]) 2 20637087 20637 206 206
... ... ... ... ... ...
Equity(51807 [BOON]) -1 -1 -1 -1 -1
Equity(51808 [NEBU]) 3 31055110 31055 310 310
Equity(51809 [DINT]) -1 -1 -1 -1 -1
Equity(51810 [NEBU_W]) -1 -1 -1 -1 -1
Equity(51811 [EAGL]) 3 31055110 31055 310 310
Equity(51812 [EAGL_W]) -1 -1 -1 -1 -1
Equity(51813 [RFL]) 2 20635084 20635 206 206
Equity(51814 [BTAI]) 2 20635084 20635 206 206
Equity(51815 [GLIB_A]) 3 30845100 30845 308 308
Equity(51817 [GLIB_P]) 3 30845100 30845 308 308
Equity(51818 [IDT_WI]) -1 -1 -1 -1 -1
Equity(51820 [BWB]) 1 10320050 10320 103 103
Equity(51821 [CMSA]) -1 -1 -1 -1 -1
Equity(51822 [GPAQ]) -1 -1 -1 -1 -1
Equity(51823 [GPAQ_W]) -1 -1 -1 -1 -1
Equity(51824 [MUDS]) -1 -1 -1 -1 -1
Equity(51825 [MUDS_W]) -1 -1 -1 -1 -1
Equity(51826 [ORGS]) -1 -1 -1 -1 -1
Equity(51827 [RCUS]) 2 20635084 20635 206 206
Equity(51828 [OPES_U]) 3 31055110 31055 310 310
Equity(51830 [TIBR_U]) 3 31055110 31055 310 310
Equity(51831 [AIHS]) 1 10322056 10322 103 103
Equity(51832 [ZS]) 3 31165134 31165 311 311
Equity(51833 [COGT_V]) -1 -1 -1 -1 -1
Equity(51834 [AIZP]) -1 -1 -1 -1 -1
Equity(51835 [RDVT]) -1 -1 -1 -1 -1
Equity(51836 [QTS_PRA]) -1 -1 -1 -1 -1
Equity(51837 [CODI_PRB]) -1 -1 -1 -1 -1
Equity(51841 [FIYY]) -1 -1 -1 -1 -1
Equity(51842 [FFEU]) -1 -1 -1 -1 -1

8573 rows × 5 columns
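The same level-dropping step can be tried outside the research environment with plain pandas. The sketch below is a toy example: the frame, the string tickers standing in for equity objects, and the level names 'date' and 'asset' are all assumptions made for illustration. It shows that selecting the single date with '.xs' gives the same result as dropping level 0 directly.

```python
import pandas as pd

# Toy stand-in for pipeline output: (date, asset) MultiIndex, one date only.
day = pd.Timestamp('2018-03-20', tz='UTC')
index = pd.MultiIndex.from_tuples(
    [(day, 'ARNC'), (day, 'AAME'), (day, 'AAPL')],
    names=['date', 'asset'],
)
toy = pd.DataFrame(
    {'sector': [310, 103, 311], 'industry_group': [31061, 10324, 31167]},
    index=index,
)

# Option 1: take the cross-section for that date; the date level is dropped.
by_asset = toy.xs(day, level='date')

# Option 2: drop the date level outright (only safe when there is one date).
same = toy.reset_index(level='date', drop=True)

print(by_asset)
```

With more than one date in the results, '.xs' is the safer choice because it keeps only the rows for the requested day, whereas dropping the level would leave duplicate asset labels in the index.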

In [7]:
# Now, looking up a particular value is simple.
# There are many ways to do this; to get the industry, for example:
aapl = symbols('AAPL')
aapl_industry = classification_df.industry[aapl]
aapl_industry
Out[7]:
31167138
In [8]:
# The '.at' accessor is a bit faster for a single scalar lookup
aapl_industry = classification_df.at[aapl, 'industry']
aapl_industry
Out[8]:
31167138
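In the rows shown above, each code appears to contain its parent as a prefix: AAPL's industry 31167138 begins with its industry group 31167, which begins with its sector 311, which begins with its economy sphere 3. The sketch below relies on that observed pattern to derive the parent codes from an industry code alone. This is an assumption drawn from the output in this notebook, not a documented guarantee, and split_industry_code is a hypothetical helper name.

```python
def split_industry_code(industry_code):
    """Derive parent codes from an 8-digit industry code by dropping trailing digits.

    Assumes the prefix pattern seen in the output above; -1 is passed through
    as -1 at every level, matching the rows with no classification.
    """
    if industry_code < 0:
        return {
            'economy_sphere': -1,
            'sector': -1,
            'industry_group': -1,
            'industry': -1,
        }
    industry_group = industry_code // 1000   # drop the last 3 digits
    sector = industry_group // 100           # drop the next 2 digits
    economy_sphere = sector // 100           # keep the leading digit
    return {
        'economy_sphere': economy_sphere,
        'sector': sector,
        'industry_group': industry_group,
        'industry': industry_code,
    }


print(split_industry_code(31167138))
```

This can serve as a quick consistency check on the four columns, or as a way to recover the coarser levels when only the industry code has been stored.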

Create a filter to get only specified industry groups

In [10]:
# Make a list of the industry groups we want (or want to exclude, as the case may be)
my_industries = [31054, 20639]

# Use the 'element_of' method to create a filter from this list
my_industry_filter = industry_group.element_of(my_industries)

# Now run our pipeline with this filter as a screen
filtered_pipe = Pipeline(
    columns={
        'economy_sphere': economy_sphere,
        'sector': sector,
        'sector_built_in': sector_built_in,
        'industry_group': industry_group,
        'industry': industry,
    },
    screen=my_industry_filter,
)
results = run_pipeline(filtered_pipe, '2018-03-20', '2018-03-20')
results
Out[10]:
economy_sphere industry industry_group sector sector_built_in
2018-03-20 00:00:00+00:00 Equity(52 [ABM]) 3 31054109 31054 310 310
Equity(53 [ABMD]) 2 20639090 20639 206 206
Equity(62 [ABT]) 2 20639090 20639 206 206
Equity(225 [AHPI]) 2 20639090 20639 206 206
Equity(630 [ADP]) 3 31054109 31054 310 310
Equity(680 [AXR]) 3 31054109 31054 310 310
Equity(1131 [BSX]) 2 20639090 20639 206 206
Equity(1706 [CNMD]) 2 20639090 20639 206 206
Equity(1941 [CTAS]) 3 31054109 31054 310 310
Equity(2212 [DLX]) 3 31054109 31054 310 310
Equity(2237 [DNB]) 3 31054109 31054 310 310
Equity(2248 [RRD]) 3 31054109 31054 310 310
Equity(2391 [DYNT]) 2 20639090 20639 206 206
Equity(2465 [EFX]) 3 31054109 31054 310 310
Equity(2691 [CLGX]) 3 31054109 31054 310 310
Equity(2853 [FISV]) 3 31054109 31054 310 310
Equity(2945 [FONR]) 2 20639090 20639 206 206
Equity(3494 [HCSG]) 3 31054109 31054 310 310
Equity(3607 [HMSY]) 3 31054109 31054 310 310
Equity(3678 [MICR]) 2 20639090 20639 206 206
Equity(4084 [IVC]) 2 20639090 20639 206 206
Equity(4141 [JKHY]) 3 31054109 31054 310 310
Equity(4343 [LABL]) 3 31054109 31054 310 310
Equity(4413 [AXGN]) 2 20639090 20639 206 206
Equity(4758 [MDT]) 2 20639090 20639 206 206
Equity(5067 [MSON]) 2 20639090 20639 206 206
Equity(5601 [OFIX]) 2 20639090 20639 206 206
Equity(5767 [PAYX]) 3 31054109 31054 310 310
Equity(6539 [ROL]) 3 31054109 31054 310 310
Equity(7178 [SYK]) 2 20639090 20639 206 206
... ... ... ... ... ...
Equity(47708 [VEC]) 3 31054109 31054 310 310
Equity(47829 [ATTO]) 3 31054109 31054 310 310
Equity(48014 [NMRD]) 2 20639090 20639 206 206
Equity(48015 [VRAY]) 2 20639090 20639 206 206
Equity(48025 [NVRO]) 2 20639090 20639 206 206
Equity(48113 [EYES]) 2 20639090 20639 206 206
Equity(48545 [AVGR]) 2 20639090 20639 206 206
Equity(48986 [NAOV]) 2 20639090 20639 206 206
Equity(49054 [MDGS]) 2 20639090 20639 206 206
Equity(49145 [SPNE]) 2 20639090 20639 206 206
Equity(49176 [TRU]) 3 31054109 31054 310 310
Equity(49413 [PEN]) 2 20639090 20639 206 206
Equity(49496 [FDC]) 3 31054109 31054 310 310
Equity(49501 [LIVN]) 2 20639090 20639 206 206
Equity(49668 [CCRC]) 3 31054109 31054 310 310
Equity(49788 [NVTR]) 2 20639090 20639 206 206
Equity(49843 [EVGBC]) 2 20639090 20639 206 206
Equity(50041 [VIVE]) 2 20639090 20639 206 206
Equity(50147 [SRTS]) 2 20639090 20639 206 206
Equity(50157 [PAVM]) 2 20639090 20639 206 206
Equity(50169 [TCMD]) 2 20639090 20639 206 206
Equity(50312 [LKSD]) 3 31054109 31054 310 310
Equity(50351 [OBLN]) 2 20639090 20639 206 206
Equity(50421 [ZTO]) 3 31054109 31054 310 310
Equity(50533 [CNDT]) 3 31054109 31054 310 310
Equity(50808 [EEX]) 3 31054109 31054 310 310
Equity(50936 [MYO]) 2 20639090 20639 206 206
Equity(51287 [KIDS]) 2 20639090 20639 206 206
Equity(51299 [HAIR]) 2 20639090 20639 206 206
Equity(51634 [AVYA]) 3 31054109 31054 310 310

172 rows × 5 columns

Notice that only our two industry groups, 31054 and 20639, are returned.
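If the unfiltered results are already in hand, a similar selection can be made after the fact in pandas. The sketch below is a rough equivalent using 'isin' on a toy frame; the string tickers stand in for equity objects and the frame itself is an assumption made for illustration. It is not the pipeline screen, which filters before the data is returned. Negating the mask with '~' covers the case where the listed groups should be excluded instead.

```python
import pandas as pd

# Toy stand-in for the single-date classification frame.
toy = pd.DataFrame(
    {'industry_group': [31054, 20639, 31167, 20639]},
    index=['ABM', 'ABMD', 'AAPL', 'ABT'],
)

my_industries = [31054, 20639]

# Boolean mask: True where the industry group is in our list.
mask = toy['industry_group'].isin(my_industries)

kept = toy[mask]        # only the listed industry groups
excluded = toy[~mask]   # everything except the listed industry groups

print(kept)
```

Screening inside the pipeline is usually preferable for large universes, since rows that fail the screen are never returned; the pandas version is mainly handy for trying out different lists without rerunning the pipeline.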