interpretation of difference between minutely open_price and prior close_price?

For minutely bars, how should one interpret the difference between the current open_price and the prior close_price (assuming continuous minute-to-minute trading)?

As an example, I used this code:

def initialize(context):  
    context.spy = symbol('SPY')  
    context.prior_close = None  # closing price of the prior minute bar  

def handle_data(context, data):  
    open_price = history(1, '1m', 'open_price')  
    close_price = history(1, '1m', 'close_price')  
    if context.prior_close is not None:  
        # difference between this bar's open and the prior bar's close  
        print float(open_price.values - context.prior_close)  
    context.prior_close = close_price.values  

A sample of the output is:

2014-11-10 PRINT 0.0
2014-11-10 PRINT 0.00999999999999
2014-11-10 PRINT -0.00999999999999
2014-11-10 PRINT 0.01
2014-11-10 PRINT -0.01
2014-11-10 PRINT 0.01
2014-11-10 PRINT 0.00699999999998

So, there is ~ +/- 0.01 peak-to-peak variation.

Is the variation real, corresponding to actual trade data? Is the data acquisition system capturing the absolute last trade just before or at the close of the minute, and then capturing the first trade immediately after the open of the minute? Or am I looking at noise?

My thinking here is that if there is no common market clock to use for sampling the market, the variation may just be noise, since trades for SPY must be effectively continuous versus time.

How frequently is the market sampled to compute the minute bars provided by Quantopian? Or do you capture 100% of the trades within a minute to compute the OHLCV bar? And what clock is used to assign sub-minute timestamps to those trades?

In general, it would be interesting to hear from the Quantopian data vendor(s) to get insight into how they manage to capture the data and reduce it to bars (particularly on-the-fly for real-time trading).

Grant

12 responses

Hi Grant,

When you refer to the 'current open_price' what you're actually referring to is the previous bar's open_price. So if the bar is currently at 9:45:00 AM and you call history(1,'1m','open_price'), what you're actually looking at is the open price for 9:44:00 AM and when you call history(1, '1m', 'close_price') what you're looking at is the close price for 9:44:59 AM.
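A toy illustration of this one-bar offset, with plain pandas standing in for the platform's history call (the bars and prices below are made up):

```python
import pandas as pd

# Hypothetical completed minute bars, labeled by bar start time.
bars = pd.DataFrame(
    {'open': [200.00, 200.01], 'close': [200.01, 200.02]},
    index=pd.to_datetime(['2014-11-10 09:43', '2014-11-10 09:44']),
)

# At the 09:45 bar, a call like history(1, '1m', ...) returns the last
# *completed* bar, i.e. the 09:44 bar -- its open is from 09:44:00 and
# its close from just before 09:45:00.
now = pd.Timestamp('2014-11-10 09:45')
last_completed = bars[bars.index < now].iloc[-1]
print(last_completed['open'], last_completed['close'])
```

So everything the algorithm sees at 9:45 is lagged by one bar; the open/close pair it gets both belong to the 9:44 bar.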

I think the confusion might be coming from the fact that whichever bar you're currently on (e.g. 10:00 AM), the price data you're getting is from the previous bar (9:59 AM), which is why you're seeing the variations.


I can't speak for Quantopian, but generally when building bars from tick data, the close of the bar is the last tick strictly prior to the first second of the next bar, and the open price is the first tick of the first second of the bar. Therefore they might be the same, or they might be different, but very rarely is the difference significant.

This is completely opposite of daily bars, of course, where the open price is an auction price to distribute pent up demand and supply accumulated overnight. There, the difference between yesterday's close and today's open is very significant.
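That convention can be sketched with pandas' resampler on some made-up ticks: each tick is bucketed into the minute it printed in, so the 9:44 close is the last tick strictly before 9:45:00 and the 9:45 open is the first tick at or after it. (This is a sketch of the general technique, not Quantopian's or Nanex's actual pipeline.)

```python
import pandas as pd

# Synthetic tick data spanning a minute boundary (hypothetical prices).
ticks = pd.Series(
    [200.00, 200.01, 200.00, 200.02, 200.01],
    index=pd.to_datetime([
        '2014-11-10 09:44:01', '2014-11-10 09:44:30', '2014-11-10 09:44:59',
        '2014-11-10 09:45:00', '2014-11-10 09:45:40',
    ]),
)

# Bucket ticks by the minute they printed in: the 09:44 bar closes with the
# last tick before 09:45:00, and the 09:45 bar opens with the first tick at
# or after 09:45:00.
bars = ticks.resample('1min').ohlc()

prior_close = bars.loc[pd.Timestamp('2014-11-10 09:44:00'), 'close']
curr_open = bars.loc[pd.Timestamp('2014-11-10 09:45:00'), 'open']
print(curr_open - prior_close)
```

Here the close-to-open gap is just whatever happened between two adjacent ticks, which is why it is usually a tick or two at most for something as liquid as SPY.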

Thanks Simon.

@Seong, what I'm after is precisely as Simon describes. How do you define and obtain closing & opening prices for minutely data? Is there any reason to think there would be any information in their difference, from minute 0 to minute 1? If the question still isn't clear, just let me know and I'll cook up an example.

Seong, Simon,

Here's some code I've been fiddling with. My read so far is that, at least for SPY, the difference between the prior minute's close and the current minute's open is not just random noise. The argument is not rigorous, but I don't think the attached algorithm would track the market the way it does if the difference (used here as an indicator) were random in time. If that's valid, the result is consistent with the idea that the market for a security like SPY really operates on a timescale much shorter than a minute, and that this attribute is captured in the backtest data.

Grant

def initialize(context):  
    context.spy = symbol('SPY')  
    context.prior_close = None # closing price of prior minute  
    context.delta_oc_limit = 0.01 # limit to determine allocation  
    # turn off commissions  
    set_commission(commission.PerTrade(cost=0))  
    # https://www.quantopian.com/posts/trade-at-the-open-slippage-model  
    set_slippage(TradeAtTheOpenSlippageModel(0)) # 0 is perfect live trading execution  
def handle_data(context, data):  
    open_price = history(1,'1m','open_price')  
    close_price = history(1,'1m','close_price')  
    hour = get_datetime('US/Eastern').hour  
    minute = get_datetime('US/Eastern').minute  
    market_open = (hour == 9 and minute == 31)  # first bar of the trading day  
    if context.prior_close is not None and not market_open:  
        # compute difference between prior close and current open  
        delta_oc = float(open_price.values - context.prior_close)  
        # apply trading rules  
        if delta_oc > context.delta_oc_limit:  
            order_target_percent(context.spy, 1)  
        elif delta_oc < -context.delta_oc_limit:  
            order_target_percent(context.spy, 0)  
    # store prior closing price  
    context.prior_close = close_price.values  
# Custom slippage model to trade at the open or at a fraction of the
# open-to-close range:
# https://www.quantopian.com/posts/trade-at-the-open-slippage-model
class TradeAtTheOpenSlippageModel(slippage.SlippageModel):  
    '''Class for slippage model to allow trading at the open  
       or at a fraction of the open to close range.  
    '''  
    # Constructor, self and fraction of the open to close range to add (subtract)  
    #   from the open to model executions more optimistically  
    def __init__(self, fractionOfOpenCloseRange):

        # Store the percent of open - close range to take as the execution price  
        self.fractionOfOpenCloseRange = fractionOfOpenCloseRange

    def process_order(self, trade_bar, order):  
        openPrice = trade_bar.open_price  
        closePrice = trade_bar.price  
        ocRange = closePrice - openPrice  
        ocRange = ocRange * self.fractionOfOpenCloseRange  
        if (ocRange != 0.0):  
            targetExecutionPrice = openPrice + ocRange  
        else:  
            targetExecutionPrice = openPrice  
        # log.info('\nOrder:{0} open:{1} close:{2} exec:{3} side:{4}'.format(  
            # trade_bar.sid.symbol, openPrice, closePrice, targetExecutionPrice, order.direction))

        # Create the transaction using the new price we've calculated.  
        return slippage.create_transaction(  
            trade_bar,  
            order,  
            targetExecutionPrice,  
            order.amount  
        )  

Your strategy is long-only, and enters when O-C > 0.01? That sounds like a strategy that will buy every up-day morning at 9:30am and hold for one minute, plus randomly when the market moves up two ticks at once at that boundary. Maybe try after excluding 1600->0930.

Thanks Simon,

Yeah, it finally sunk in that it is not purely intraday, since it could end up holding overnight and then selling, which would allow for significant price drift (if I'm thinking about it correctly). Maybe this would be a chance to try the new schedule_function helper that Quantopian just released.

Grant

For illiquid securities, there are missing bars in the Quantopian backtesting database, which Quantopian sometimes fills forward (depending on how the backtest algo is configured). But I guess it is possible that there is forward filling by Nanex in certain cases. My understanding is that, within a minute-long time window (defined either by the timestamps on the trades on the tape, or by Nanex's own clock or one provided with the tape), O is the first trade price, H (L) is the highest (lowest) trade price, and C is the last trade price. If a minute bar is defined that way, then a minute containing only one trade (or trades that all happen to print at exactly the same price) would produce an O=H=L=C bar.
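The O=H=L=C case is easy to demonstrate on made-up ticks: under that definition, a minute containing a single trade collapses to a flat bar.

```python
import pandas as pd

# One minute with several ticks, one minute with a single tick (synthetic data).
ticks = pd.Series(
    [50.00, 50.05, 49.98, 50.10],
    index=pd.to_datetime([
        '2014-11-10 10:00:05', '2014-11-10 10:00:30', '2014-11-10 10:00:55',
        '2014-11-10 10:01:20',  # the only trade in the 10:01 minute
    ]),
)
bars = ticks.resample('1min').ohlc()

# The 10:01 bar has a single trade, so open == high == low == close.
flat = bars.loc[pd.Timestamp('2014-11-10 10:01:00')]
print(flat.tolist())
```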

Well, there's no Quantopian clock, per se. The function handle_data only gets called if there is data, meaning that at least one of the securities available to the algo has a non-empty bar in the database for that minute. Typically, it is recommended that SPY be added to the algorithm (even if it is not traded) as a proxy for a market clock.

There's a rather extensive discussion here:

https://www.quantopian.com/posts/thinly-traded-stocks-why-no-gaps

As Fawce shared there, you can test:

# http://seekingalpha.com/article/123969-five-thinly-traded-stocks-with-plenty-of-upside  
def initialize(context):  
    context.stocks = [sid(3722),sid(33054),sid(16812),sid(5107),sid(1675)]  
def handle_data(context, data):  
    for stock in data:  
        if data[stock].datetime < get_datetime():  
            log.info("no trades in {s}".format(s=stock.symbol))  

The history API also allows one to toggle filling on/off:

Illiquidity and Forward Filling

By default, history methods will forward fill missing bars. If there is trade data missing for a security one day, the function will use the previous known price until there is new trade data.

However, it can be useful to have visibility into gaps in the trading history. We provide an option for the history methods, which disables forward filling. e.g., history(bar_count=15, frequency='1d', field='price', ffill=False) will return a DataFrame that has nan value for price for days on which a given security did not trade.
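The ffill toggle behaves like ordinary pandas forward filling; here is a small sketch on a synthetic gappy price series (not actual Quantopian data):

```python
import numpy as np
import pandas as pd

# Daily prices with a missing bar (no trades printed on 2014-11-12).
prices = pd.Series(
    [10.0, np.nan, 10.5],
    index=pd.to_datetime(['2014-11-11', '2014-11-12', '2014-11-13']),
)

# ffill=True behavior: carry the last known price across the gap.
filled = prices.ffill()
print(filled.tolist())

# ffill=False behavior: the gap stays NaN, exposing the missing bar.
print(int(prices.isna().sum()), 'missing bar(s) visible')
```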

Indeed, one of the problems I am facing now is limit-order fill simulation. For illiquid ETFs that trade a few dozen times a day, how likely, if at all, is it that I could get filled on a limit order at or inside the spread? I literally have no idea; I am contemplating firing up TWS and just trying a few test trades to gather some data. Assuming market orders and cross-spread fills for illiquid ETFs is much easier to model, and more likely, but then you have the high bar of profiting from mean reversion while swallowing the huge spreads each round trip. Nobody said it would be easy lol.

@Simon,

There should be a way to back out a probability distribution, if you can assume that whatever orders you submit are similar to those that resulted in historical trades. There's a procedure (which I've never used) to obtain an empirical characteristic function (see http://en.wikipedia.org/wiki/Characteristic_function_%28probability_theory%29, 'Data Analysis'). I'll let you know if I learn anything useful.
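Short of the characteristic-function machinery, a naive empirical version of "backing out a distribution" might look like this (the helper and numbers are entirely hypothetical, under the strong assumption that a resting buy limit fills whenever a trade prints at or through it):

```python
# Naive empirical fill-probability estimate for a buy limit order: the
# fraction of historical intervals in which at least one trade printed at
# or below the limit price. Synthetic trade data, illustrative only.
def fill_probability(interval_lows, limit_price):
    """Fraction of intervals whose low traded at or below limit_price."""
    hits = sum(1 for low in interval_lows if low <= limit_price)
    return hits / len(interval_lows)

# Hypothetical daily lows for an illiquid ETF.
daily_lows = [20.10, 20.05, 20.20, 19.95, 20.00]
print(fill_probability(daily_lows, 20.00))  # 2 of 5 days touched 20.00
```

This ignores queue position and the effect your own order would have had on the market, so it is at best an upper bound on the true fill rate.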

Grant

Yeah... This is one of the huge motivations for going live fast - getting fill data.

Here's an algo that I cobbled together that uses the current_open - prior_close indicator. It's not realistic, since the transaction costs are set to zero, but I think it illustrates that the indicator may contain information, rather than just noise. --Grant