Quantopian's community platform is shutting down. Please read this post for more information and download your code.
Back to Community
Cant Create a columns for SID & Symbols Name

Hi Fellow Quants,

I am trying to build a data set that creates a Data Frame with the following columns:

SID, Symbol Name, Price, Volume, Market Cap, Super Sector Name, Super Sector Code, Sector Name, Sector Code, Industry Group, Industry Group Code, Industry Name, Industry Code

I took the sector & industry classifications based on the MorningStar classification on the fundamentals page and inserted them into a CSV that I am pulling using the local_csv function.

https://www.quantopian.com/help/fundamentals

The challenge is that when i merged the 2 datasets and deduplicated on column names, I lost the column created by default for pipelines that shows SID & Symbol it looks like "Equity(24 [AAPL]) ".

Can someone look at my code and show me how my merge between the 2 data frames is removing this column. This or how to pull SID & Symbol name as a column would be greatly appreciated.

4 responses

The issue is that the 'SID & Symbol which looks like "Equity(24 [AAPL])' is the asset object. This is the index of the pipeline dataframe. The 'merge' method ignores this index when joining on columns (see http://pandas.pydata.org/pandas-docs/version/0.18/generated/pandas.merge.html ). To get around that, first move the index to a column, then do the merge, then set the index back to the index column. This can be done in one line.

result10.reset_index().merge(test, on=['Industry Code', 'Super Sector Code', 'Sector Code', 'Industry Group Code']).set_index('index')

This assumes the pipeline output is 'result10'. I didn't actually try this and it may get confused with the pipeline output having a multi-index of both assets AND dates, but this is a general approach. You may also need to add a 'how' parameter to the 'join' method? One subtle point in the above code is that, by default, the 'reset_index' method sets the name for the new column it makes to 'index'. That is why 'set_index('index')' works.

I didn't have the CSV file to test this with, so again my apologies if it doesn't work exactly. The general approach however should work.

Good luck.

Hey Dan,

"how to pull SID & Symbol name as a column" Did you get it ? Can you tell me ?

Good question. One often needs the SID or symbol to filter, sort, or select on. Pandas often works much better if these values are in columns rather than part of the index. There are a couple of approaches to adding these values as a new dataframe column.

# assume notebook_df is the dataframe returned by a pipeline  
# one approach is to use the pandas `map` method  
notebook_df['symbol'] = notebook_df.index.get_level_values(level=1).map(lambda asset: asset.symbol)  
notebook_df['sid'] = notebook_df.index.get_level_values(level=1).map(lambda asset: asset.sid)

# another approach is to make a list using python's list comprehension  
assets = notebook_df.index.get_level_values(level=1)  
notebook_df['symbol'] = [asset.symbol for asset in assets]  
notebook_df['sid'] = [asset.sid for asset in assets]

There is a quirk with this approach however. The data stored in the asset objects is always the current data. This data is not 'point in time' like all other data. This isn't a problem with SID. That value is constant throughout the lifetime of a security. However, the symbol may change. The symbol returned will be the symbol as of today. It may not be the symbol as of the pipeline date. To get the symbol as of the pipeline date, then use the Morningstar fundamentals symbol data field in a pipeline definition.

from quantopian.pipeline.data.morningstar import Fundamentals  
# Create a classifier for the historical stock symbol as of the pipeline date  
symbol = Fundamentals.symbol.latest

There is a little more detail along with a notebook showing this in action in this post https://www.quantopian.com/posts/how-to-change-asset-objects-to-string-in-dataframe-returned-from-run-pieline

Good luck.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Thanks a lot.