data.history retrieves Panel as opposed to DataFrame?

Hi,

Relatively new to Python so hopefully not a stupid question:

I'm attempting to generate multiple DataFrames of historic pricing, one per set of securities, and put them in a dictionary, but for some reason I'm getting a Pandas Panel instead:

for industry in context.industrydic.keys():
    context.frame = data.history(context.industrydic[industry],['close'], context.lookback,'1d')
    context.frame.columns = map(lambda x: x, context.frame.columns)  # <-- marked line: error raised here
    context.priceframe[industry] = context.frame

On the marked line I'm getting an error since the Panel doesn't have columns. For context, industrydic is a dictionary with the keys being Morningstar industry codes, and the values being a list of tickers with that industry code.

I've read a bit about it and would only expect to get a Panel if asking for multiple fields on multiple securities, which I'm not.

This works fine in Research but when I move to Algorithm IDE it starts breaking.

Any ideas please let me know!

Thanks!

4 responses

Maybe attach a backtest showing your code?

I was able to paste the above into an algorithm and have it run without error. See attached algo. Verify it's what you had in mind.

Also, what are you trying to accomplish? One typically wouldn't create a dictionary of dataframes. Kinda like mixing apples (the dataframe construct) with oranges (the dictionary construct). Not saying it can't be done, you just might find it cumbersome. The more typical approach is to stick with a single construct (e.g. a dataframe) and put all your data into it. You can slice, dice, and index dataframes (to select a specific industry, for example) just like dictionaries. Simply add a column or index level for industry. Just a thought.
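To make the single-dataframe idea concrete, here is a minimal sketch in plain pandas. The prices and dates are made up for illustration, and the small frames stand in for what data.history would return per industry; none of this is Quantopian-specific. The industry name becomes an outer column level.

```python
import pandas as pd

# Hypothetical closing prices standing in for per-industry data.history output.
index = pd.date_range("2020-01-01", periods=3, freq="D")
tech = pd.DataFrame({"AAPL": [1.0, 2.0, 3.0], "MSFT": [4.0, 5.0, 6.0]}, index=index)
finance = pd.DataFrame({"BAC": [7.0, 8.0, 9.0], "WFC": [10.0, 11.0, 12.0]}, index=index)

# Concatenate along the columns; the dict keys become the outer column level,
# so each column is labelled (industry, ticker).
prices = pd.concat({"tech": tech, "finance": finance}, axis=1)

# Selecting on the outer level gives back just that industry's columns.
tech_prices = prices["tech"]
```

Selecting prices["finance"] works the same way, so the per-industry lookup a dictionary offers is still available.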


Thanks for getting back so quick!

Mine is very similar to yours, but I have a list of symbols per industrydic key, e.g.:

context.industrydic = {'tech': [symbol('AAPL'), symbol('MSFT')],  
                       'finance': [symbol('BAC'),symbol('WFC')],  
                       'consumer': [symbol('PG'),symbol('UN')]  
                       }

which creates a Panel rather than a DataFrame - in the docs it says the below which seems to contradict this - but I'm probably wrong somehow:

If multiple assets and a single field are requested, the returned value is a pd.DataFrame with shape (bar_count, len(assets)). The frame's index will be a pd.DatetimeIndex, and its columns will be assets.

Instead I get a Panel with 1 item 'close'

The reason I'm doing this is because the next stage is to check each DataFrame for cointegrated pairs, and I didn't want to check for pairs not in the same industry, both since it would take longer computationally and would be less relevant to look for pairs across the whole universe.
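As a rough illustration of the saving, here is a small sketch that only enumerates candidate pairs. It uses plain ticker strings as stand-ins for the symbol(...) objects, the helper name within_industry_pairs is made up for this example, and the cointegration test itself is left out.

```python
from itertools import combinations

# Hypothetical ticker strings standing in for the symbol(...) objects above.
industrydic = {
    "tech": ["AAPL", "MSFT"],
    "finance": ["BAC", "WFC"],
    "consumer": ["PG", "UN"],
}


def within_industry_pairs(groups):
    """Return {industry: [(a, b), ...]} with every unordered pair inside each group."""
    return {name: list(combinations(members, 2)) for name, members in groups.items()}


pairs = within_industry_pairs(industrydic)

# Number of candidate pairs when staying inside each industry.
n_within = sum(len(p) for p in pairs.values())

# Number of candidate pairs when every ticker is compared with every other one.
all_tickers = [t for members in industrydic.values() for t in members]
n_universe = len(list(combinations(all_tickers, 2)))
```

With two tickers per industry that is 3 candidate pairs instead of 15, and the gap widens quickly as the lists grow.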

The data.history method is returning a Panel because it's being given a list of assets and a list of fields. "Hmm," I can hear you saying, "I only have a single field, 'close'." While that's true, since the field is in brackets, the method interprets it as a list (albeit a list of length 1). Therefore it returns a Panel. The fix is easy enough: remove the brackets around 'close'.


context.frame = data.history(context.industrydic[industry],'close', context.lookback,'1d')

That should do what you want. Good luck.
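For reference, the rule can be sketched as a tiny stand-alone function. This is not Quantopian code: expected_history_type is a made-up helper, strings stand in for asset objects, and it simply mirrors the return types as I understand them from the docs and from the behaviour described in this thread.

```python
def expected_history_type(assets, fields):
    """Name the container data.history is expected to return.

    Assumption for this sketch: anything passed as a list or tuple counts
    as "multiple", even when it holds a single element.
    """
    multiple_assets = isinstance(assets, (list, tuple))
    multiple_fields = isinstance(fields, (list, tuple))

    if multiple_assets and multiple_fields:
        return "Panel"      # one item per field
    if multiple_assets or multiple_fields:
        return "DataFrame"  # columns are the assets (or the fields)
    return "Series"         # single asset, single field


# The call from the original post: list of assets, list with one field.
original = expected_history_type(["AAPL", "MSFT"], ["close"])

# The fixed call: list of assets, field passed as a bare string.
fixed = expected_history_type(["AAPL", "MSFT"], "close")
```

Here original comes out as "Panel" and fixed as "DataFrame", which matches what happened in the algorithm.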

Ah! Knew it would be something stupid!

Thanks very much!