Security data sometimes contain NaNs - why?

The attached algo sometimes gets NaNs in the trailing window of data provided by history (see the code and the log output). Shouldn't the missing data be filled? Or am I not using history correctly?

--Grant

8 responses

Hello Grant,

I cloned the algo and tried it. It looks like some data are missing. I changed history(60) to history(6) and got the same result. The missing values have been pretty stable. For example, you can't get the open price of XLI on 2008-2-13.

Thanks Jiaming,

I suspect somethin' ain't workin' right here:

https://github.com/quantopian/zipline/blob/5dbedfdce8619996a77202fd9dc5a7a60987a7e7/zipline/history/history_container.py

By convention, how are daily bars recorded in the trading industry? Shouldn't the open price be the first price recorded for the day (regardless of when the trade occurred)? It seems that history is looking for a trade at market open and if it doesn't find one, then attempting to do a forward fill from the prior day.

Grant

By convention it should be the first 'print' of that day.
For example, if a stock was trading at 100 at yesterday's closing bell and nothing happened today until the first trade printed at 101 at 9:37, then today's open should be 101.

Hi, I checked the code you linked previously. It looks like NaNs at the start of the day are filled with the last trading day's closing price.
See lines 55-64:

def ffill_day_frame(field, day_frame, prior_day_frame):
    # get values which are nan-at the beginning of the day
    # and attempt to fill with the last close
    first_bar = day_frame.ix[0]
    nan_sids = first_bar[np.isnan(first_bar)]
    for sid, _ in nan_sids.iterkv():
        day_frame[sid][0] = prior_day_frame.ix[-1, sid]
    if field != 'volume':
        day_frame = day_frame.ffill()
    return day_frame
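
To see what that function does on a small input, here is a rough sketch of the same logic written against current pandas (`.iloc` and `.at` instead of the older `.ix` and `iterkv`). The function name `ffill_day_frame_sketch`, the sids `SID_A` and `SID_B`, the field name passed in, and all of the prices are made up for illustration; this is not the zipline code itself.

```python
import numpy as np
import pandas as pd


def ffill_day_frame_sketch(field, day_frame, prior_day_frame):
    """Seed NaNs in the first bar from the prior day's last bar, then forward fill."""
    day_frame = day_frame.copy()  # leave the caller's frame untouched
    first_bar = day_frame.iloc[0]
    # Sids with no value on the first minute bar of the day.
    nan_sids = first_bar.index[first_bar.isna()]
    for sid in nan_sids:
        # Fill the first bar with the last value of the prior day.
        day_frame.at[day_frame.index[0], sid] = prior_day_frame[sid].iloc[-1]
    if field != 'volume':
        # Prices carry forward through minutes with no trade; volume does not.
        day_frame = day_frame.ffill()
    return day_frame


# Made-up minute bars: SID_A does not trade until the third minute of the day.
prior = pd.DataFrame(
    {'SID_A': [99.5, 100.0], 'SID_B': [49.0, 49.5]},
    index=pd.to_datetime(['2014-01-02 15:59', '2014-01-02 16:00']),
)
today = pd.DataFrame(
    {'SID_A': [np.nan, np.nan, 101.0], 'SID_B': [50.0, np.nan, 50.5]},
    index=pd.to_datetime(['2014-01-03 09:31', '2014-01-03 09:32', '2014-01-03 09:33']),
)

filled = ffill_day_frame_sketch('open_price', today, prior)
print(filled)
```

In this toy input the first bar of SID_A comes out as 100.0, the prior day's last value, even though the first actual trade of the day was at 101.0. That looks like the behavior being discussed in this thread.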

Jiaming,

Thanks...yeah, that was my interpretation when I looked at it briefly this morning. I'm confused how history is generating the trailing window of daily OHLC bars. Per the convention you confirmed above, there should be no need for filling of nans unless no trade occurred at all on a given day. It seems that the code should be scanning over each security each day and finding the first trade, and then assigning the daily open price as the open price for that first minute bar, right? And similarly, for the close, the code would find the last trade for each day for each security, which would correspond to the closing price of the last trading minute of a given security.
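
Here is a rough sketch of the scan I have in mind, assuming the minute bars for one security sit in a pandas DataFrame with NaN in any minute where no trade occurred. The column names `open_price` and `close_price` and all of the numbers are made up for illustration, and this is only my guess at an approach, not how history is implemented.

```python
import numpy as np
import pandas as pd

# Made-up minute bars for one security over two sessions.
# NaN marks a minute in which no trade occurred.
index = pd.to_datetime([
    '2014-01-02 09:31', '2014-01-02 09:32', '2014-01-02 16:00',
    '2014-01-03 09:31', '2014-01-03 09:37', '2014-01-03 15:59', '2014-01-03 16:00',
])
minute_bars = pd.DataFrame(
    {
        'open_price': [99.0, 99.5, 99.9, np.nan, 101.0, 101.4, np.nan],
        'close_price': [99.2, 99.6, 100.0, np.nan, 101.2, 101.5, np.nan],
    },
    index=index,
)

# Group the minute bars by calendar day.
by_day = minute_bars.groupby(minute_bars.index.normalize())

# GroupBy.first() and GroupBy.last() skip NaN, so they pick the first and
# last minutes that actually traded rather than the first and last rows.
daily = pd.DataFrame({
    'open_price': by_day['open_price'].first(),
    'close_price': by_day['close_price'].last(),
})
print(daily)
```

With this approach a daily value would be NaN only if the security had no trade at all on that day, which is the one case where filling from the prior day seems reasonable.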

Grant

Grant, if you try history(1), you actually get perfect data. Beyond that, I have no further clue.

Hi Grant,

Up until recently, Zipline's history implementation was calculating daily open prices by only looking at the first bar of the day. We recently did a fair amount of work to overhaul history in prep for supporting more data frequencies, and in the course of that work we reviewed the semantics for open and decided that open should be the price of the first trade in the period (as opposed to the price on the first bar of the period).

See https://github.com/quantopian/zipline/commit/bad4c9a4398d7173ad6cc0f3e3d63f636ecc4c98 for details.

Thanks Scott,

I'm definitely no expert, but it sounds like the change will bring daily bars returned by history in line with conventional practice. Good to hear that additional data frequencies will be added, too.

Grant