Is this even close to correct?

Hi All,

Let me start by confessing that I am a complete novice to programming. Needless to say, I feel more than a little out of sorts when using Quantopian.

That said, I have been trying to code an algorithm for a strategy that I have often tested in Excel.

Theory: ETFs are supposed to trade at or very near their NAVs. The degree of variation from the NAV may differ between ETFs and depends on various factors, but for a single ETF it should remain within a consistent range.

What we know: Often an ETF will trade above its NAV and beyond the normal range of variation. This is usually caused by excess demand from the market. The normal correction is for the AP to buy more 'Creation Units' (buying units at NAV, selling at the market price, and making a risk-free profit).

Strategy: Buy 1,000 shares if the opening price is lower than the previous closing price (NAV) and short sell 1,000 shares if the opening price is higher than the previous closing price (NAV). This would translate to one trade a day, I think.

I wasn't sure how to bring the NAVs into the algorithm, so I substituted the closing price in its place, since ETFs always seem to close at or near their NAV for the day.
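
A minimal sketch of the rule as stated above, in plain Python rather than the Quantopian API. The function name `opening_signal` is hypothetical, and the prior close stands in for the NAV, as in the post.

```python
def opening_signal(prior_close, open_price, shares=1000):
    """Return the signed share count to trade at the open.

    prior_close is used as a proxy for the NAV (an assumption from the post).
    """
    if open_price < prior_close:
        return shares       # opened below the proxy NAV: buy
    if open_price > prior_close:
        return -shares      # opened above the proxy NAV: sell short
    return 0                # opened exactly at the proxy NAV: no trade
```

For example, `opening_signal(100.0, 99.5)` returns `1000` and `opening_signal(100.0, 100.5)` returns `-1000`.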

I am not sure of the following:
- is the algo closing the position at the end of each day?
- is it doing what I think it is doing (i.e. the strategy above)

Thanks,
GL

24 responses

this is the last backtest I ran

Hello Gautam,

A few comments:

  1. If I understand correctly, your algorithm will need to be developed for minute data, not daily. Basically, you need to record the closing price of the final minute of the prior trading day, and then compare it to the closing price of the opening minute of the current trading day.
  2. I don't know of a sure-fire way of closing out positions prior to the end of the trading day. One issue is that there are a few days when the market closes early, but there is no flag in the Quantopian database to indicate these days (presumably, one could create a custom flag with fetcher).
  3. The outline of your strategy makes sense, and can be implemented in Quantopian. However, the code you posted doesn't implement the strategy (if I understand it correctly).
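
Point 1 could be sketched roughly as follows. This is plain Python, not the Quantopian API; the class name `SessionTracker` is hypothetical, and it assumes the caller feeds it one (timestamp, close) pair per minute bar in time order.

```python
from datetime import datetime


class SessionTracker:
    """Remember the prior session's final minute close and today's first minute close."""

    def __init__(self):
        self.current_date = None
        self.last_price = None    # most recent minute close seen
        self.prior_close = None   # final minute close of the previous session
        self.today_open = None    # close of the first minute of the current session

    def update(self, timestamp, price):
        """Feed one minute bar; return True on the first bar of a new session."""
        new_session = timestamp.date() != self.current_date
        if new_session:
            # The last price seen so far belongs to the previous session.
            self.prior_close = self.last_price
            self.today_open = price
            self.current_date = timestamp.date()
        self.last_price = price
        return new_session
```

On the first bar of each day after the first, `prior_close` and `today_open` hold the two prices to compare. On the very first bar of a run, `prior_close` is still `None`, so no comparison should be made.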

I have some code that captures the closing price, and compares it to the opening price of the next day. It needs to be refactored, with comments...I may get to it in a few days and would be glad to share it.

Perhaps others have guidance on how to close out positions prior to market close?

Grant

Thanks for your insight Grant.

Re: closing the trade. Since the price always adjusts on the same day, would it be possible to close the trade by using a take-profit criterion?

@Gautam, I'll be interested to see how your theory works out.

Here is a counter-theory of sorts:

ETFs adjust their NAV daily at close based on the market movements of the day. The intraday trading price is the market's anticipation of the new NAV at close. So the fact that the market price of an ETF is above or below yesterday's NAV in itself may not give you a trading edge.

But I do have another idea for you:

The ETF issuer must buy (or sell) stocks to account for the creation (or destruction) of Creation Units. They typically do this at daily close (when the new NAV is set). So you could theoretically watch the money flow for the ETF and trade on the squeeze (positive flow) or run-off (negative flow) of the underlying securities near close.

@Dennis

Re: the counter theory.
That is what I assumed in the beginning as well, but here's what I think:
- Market prices move according to supply and demand alone.
- As I understand it, issuers can create/destroy only at the last 'official' NAV.

Therefore, if the opening price is over/under yesterday's close, they will move to create/destroy ASAP and lock in a risk-free profit (before the prices of the underlying securities move too much and alter the NAV).
There are other 'real world' factors to consider, of course (opening volumes, getting the price, etc.), that won't be known until the strategy is tested live.
It is also worth noting that this mismatch happens more often and persists for longer periods with international ETFs (because of timing and availability, I imagine).

Re: the money flow. I randomly picked an ETF on the Canadian iShares website (CRQ) and used their quotes & charts tool.
http://tools.ishares.wallst.com/ishares/qc/summary.asp?symbol=CRQ&pt=false&locale=en-CA&user_tier=CAIndv&user_id=INDIVIDUAL&countryCode=CA

Money flow does seem to track the price change throughout the day. However, they don't give finer data, so it's hard to say whether it's a leading or lagging indicator. You almost need a momentum value for the money flow itself.
Is that what you were getting at with squeeze & run-off? (I'm not in the industry; this is a hobby/obsession.)

GL

I meant 'squeeze' in the sense of a short squeeze. In this case, although the ETF issuer isn't forced to cover a short position (as in the definition of 'short squeeze'), they nonetheless have a forced-buy constraint that could cause upward price pressure.

Likewise, I was referring to portfolio runoff, but meant it in the sense that the ETF issuer would be forced to sell off shares and could be trapped by (or even cause) downward price pressure.

I'm a programmer not an industry insider. As a result I tend to abuse the terminology fairly badly. My apologies!

What I meant about money flow was to try to anticipate whether the ETF issuer would have a net gain or loss of Creation Units at the end of the day.

However if the ETF issuer is in the practice of buying and selling the underlying securities all through the day then no end-of-day effect should be seen at all.

Here's an excel file of this strategy executed on CRQ, an ETF on the TSX
CRQ - Open vs. NAV

I think to make this profitable on US ETFs you would need to know how often they calculate the NAV. And be able to get the NAV value as often as they do.

Hello Gautam,

To answer your question above, you could close the trade with a take-profit criterion, but if you never reach the trigger then you'll end up holding beyond the day's close. So, you'd still need to close out based on the time of day, if the take-profit criterion is not met.
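
A rough sketch of that combined exit rule, in plain Python. The function name `should_exit` and the 15:55 cutoff are assumptions for illustration; on early-close days the cutoff would need to be moved earlier.

```python
from datetime import time


def should_exit(position, price, target, now, cutoff=time(15, 55)):
    """Exit when the take-profit target is hit or the cutoff time is reached.

    position: signed share count (positive = long, negative = short).
    target:   price at which to take profit (e.g. the proxy NAV).
    now:      current time of day as a datetime.time.
    cutoff:   hypothetical flatten-by time, not a platform setting.
    """
    if position == 0:
        return False            # nothing to close
    if now >= cutoff:
        return True             # time-based fallback: never hold overnight
    if position > 0:
        return price >= target  # long: price rose to the target
    return price <= target      # short: price fell to the target
```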

Grant

Seems like a good idea. The implementation could be improved by addressing the following, IMHO:

  1. Although the discussion mentions NAV, there is no attempt to spell out it is the net asset value, a sum of values of underlying securities. That would not be a problem, it is easy to google, but the real problem is the algorithm makes no attempt to compute the NAV either. There is no code that reads prices for constituents of the ETF, sums them up and compares them to the monitored ETF price.
  2. open_price and close_price are used as proxy for NAV. While this is dubious, there is another issue. open_price and close_price are price at the beginning and the end of the interval observed, not of the previous or current day open/close price. At least that's my recollection of reading the API. My reading of the discussion above leads me to believe you wanted to use the open/close price in a market sense, not in Quantopian API definition of the term. One has to be careful here too, open price will actually be the open price of the day when testing in day to day mode, but not when testing in minute to minute mode or live trading.

Running backtest of your algorithm from 2002 shows a huge loss. I had similar issues in my code, essentially the algorithm builds huge positions, leverage being hundreds if not thousands of times the initial capital. Then it proceeds to use such presumed margin to make trades. If they turn profitable, the profit shows up as huge, but the loss shows disastrous too. Without modification to the code the longer back tested run shows a loss that would bankrupt all but the largest of financial institutions. This is not the first time this makes me think of a sandbox concept, where loss would be limited to initial capital and a few more restrictions imposed on the algorithm that would be provided by the API or a wrapper to order()
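
To illustrate the kind of wrapper I mean, here is a minimal sketch in plain Python. The name `capped_order_size` is hypothetical and it is not part of the Quantopian API; it also assumes a single-security portfolio, so exposure is just position times price.

```python
def capped_order_size(desired_shares, current_shares, price,
                      portfolio_value, max_leverage=1.0):
    """Shrink an order so the resulting position stays within the leverage cap.

    Returns the signed number of shares that may actually be ordered.
    """
    if price <= 0 or portfolio_value <= 0:
        return 0
    # Largest absolute position the cap allows at this price.
    max_shares = int(max_leverage * portfolio_value / price)
    target = current_shares + desired_shares
    # Clamp the resulting position to [-max_shares, +max_shares].
    target = max(-max_shares, min(max_shares, target))
    return target - current_shares
```

With $10,000 of capital and a $50 price, a request for 1,000 shares is cut to 200, and a second identical request while already holding 200 is cut to 0.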

To Quantopian: It looks like you have changed the interface the backtest produces. There is no longer a tab with trades and positions. That was a good and quick way to see what got bought and sold and how the algorithm is behaving. Is there a reason for removing it? It seems to have been quite a useful thing. Looking into old posts, the data is missing there too. Perhaps it is only a JavaScript change in the web server, something you can put back?

Vlatko - Trades and positions have only ever been available in the "full backtest" result interface. They've never been available in the "widget" interface you see here in the community. To my knowledge, we haven't removed that information from any displays. Of course, please let me know if you think I'm mistaken.

Vlatko - You are correct in pointing out the lack of a NAV component in the algorithm. I could not see a way to add it in and had to substitute the closing price (which closely matches the day's NAV). The Python equivalent of a Get function would be able to pull it in from the web.

Closing prices tend to be close to the NAV, which is why I used them, but I would swap them out as soon as I know how to pull in the NAV.
I am not using the open price as a substitute for the NAV.

I do see that the algorithm is not using the daily open and close prices. I have no idea what to do about that, which is why I attached an Excel version of the backtest (done manually). The steps I am trying to code in the algorithm go like this:
1. The market opens, and the algorithm checks whether the opening price on the market is above or below yesterday's NAV (+/- the avg. variance).
2. If the Open Price > NAV + Var, you short sell 1,000 units of the ETF and cover the short when the market price drops to the NAV value or at day's end.
OR, if the Open Price < NAV - Var, you buy 1,000 units and sell them when the market price rises to meet the NAV or at day's end.
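
For reference, the steps above could be sketched for a single day like this. It is plain Python, not the Quantopian API; `simulate_day` is a hypothetical name, fills are assumed to happen exactly at the observed prices, and commissions and slippage are ignored.

```python
def simulate_day(nav, band, minute_prices, shares=1000):
    """Simulate one session of the open-versus-NAV rule; return the day's P&L.

    nav:           yesterday's NAV (or the closing-price proxy).
    band:          the average variance allowed around the NAV.
    minute_prices: the session's prices in time order; the first is the open.
    """
    if not minute_prices:
        return 0.0
    open_price = minute_prices[0]
    if open_price > nav + band:
        side = -1               # step 2: short at the open
    elif open_price < nav - band:
        side = 1                # step 2 (OR branch): buy at the open
    else:
        return 0.0              # open is inside the band: no trade

    exit_price = minute_prices[-1]      # default: close out at day's end
    for price in minute_prices[1:]:
        reached_nav = price <= nav if side == -1 else price >= nav
        if reached_nav:
            exit_price = price          # price converged to the NAV
            break
    return side * shares * (exit_price - open_price)
```

Because the position is always opened and closed inside one call, this version cannot accumulate positions from one day to the next.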

I don't know why the algorithm opens that many positions in the back test.

If you look at the Excel file, you can see the result when this is done manually according to the steps laid out above

You can use the Fetcher function to get external CSV formatted data sources.

https://www.quantopian.com/help#overview-fetcher

Gautam,

Here's an example you can try. Hopefully, you can get a feel for what's going on. It's not your full strategy, but perhaps a start...

Note that the order method submits an order; it does not fill the order.

Grant

@Dan, so when you share the backtest the transaction details get lost, logs are gone too now that I am looking closer. Thanks for pointing it out. Since the whole algorithm is shared there is not much security reason to keep the positions and logs out, but they are not essential either, just convenient.

@Gautam, the previous day's price can be computed if one keeps a memory of past ticks and then, in a separate function, stores the last price whose date is earlier than the current date. It is one of those things I would like to add to some library and eventually see as part of the API. A similar algorithm can be used for the opening price of the day.

More to the point, to compute the value of the basket one needs to know what the basket consists of. That may be relatively easy for the Dow Jones, then a bit harder for the Russell 3000, but there should be resources on the web that say which tickers are in which index, ETF or fund. It gets tricky with backtesting, as one needs to know when the set changes and how. This is a good case for something called reference data, as opposed to the tick data Quantopian provides. Reference data is pretty much any qualitative description of the securities traded/observed, including the constituents of an index or ETF, announcement dates and similar. An example of a change to be accounted for is Tesla becoming part of the Nasdaq-100 and Oracle being dropped from it.

If you have a basket, presumably the basket as it changes over time, you can manually add it to your program and use the stocks in the basket to compute its value. Not having reference data readily available makes this a long and tedious job, which is a case for putting reference data on the wish list for the platform. It is quite an expensive item on the list too, one that can easily mushroom into an elaborate and expensive part of the system.

Algorithm to produce the basket list has a painful manual step that Quantopian could eliminate if the ticker (at time of the tick observed) were provided with sid data.

  1. Download a constituent list for the basket, e.g. Dow Jones tickers over time.
  2. Get the sid for each ticker. This is the manual step that needs to be done before compilation.
  3. For each tick, compute the value of the basket components and compare it to the trade price of the basket security.
  4. Make a decision based on data from step 3.
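
Step 3 might look roughly like this in plain Python. The function name and inputs are hypothetical: it assumes the fund's holdings are known as share counts per ticker, which also takes care of weighting, and it ignores cash, fees and accrued items that a published NAV would include.

```python
def basket_premium(etf_price, holdings, prices, shares_outstanding):
    """Return the ETF's premium (+) or discount (-) to an estimated per-share NAV.

    holdings:           {ticker: number of shares the fund holds} (hypothetical input).
    prices:             {ticker: latest price}.
    shares_outstanding: ETF shares outstanding.
    """
    missing = [t for t in holdings if t not in prices]
    if missing:
        raise KeyError("no price for: " + ", ".join(sorted(missing)))
    # Sum the market value of every constituent position.
    basket_value = sum(qty * prices[t] for t, qty in holdings.items())
    nav_per_share = basket_value / shares_outstanding
    return (etf_price - nav_per_share) / nav_per_share
```

Step 4 would then compare the returned fraction against whatever threshold the strategy treats as outside the normal range.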

The data for step #1 can be obtained. I did not find the tickers in the few seconds I spent searching, but this data is public and at least Wikipedia has the company names.
http://en.wikipedia.org/wiki/Historical_components_of_the_Dow_Jones_Industrial_Average

@Vlatko, my first thought is you could create a CSV that contains the numeric sid identifiers for the DJIA (or any other basket). Maybe use negative numbers for removals. You'd still have to manually add the sid() references to your code somewhere.

Date,DJIA  
1896-05-26,3149 # GE  
1901-04-01,-3149 # GE removed  
1901-04-01,3971 # IP  
1901-07-01,-3971 # IP removed  
1901-04-01,8329 # X  
1907-11-01,3149 # GE re-added  
1915-03-16,3246 # GM  
1916-10-04,-8329 # X removed  
etc  

However I don't think that would play very nicely with the Fetcher. The dates prior to the backtest wouldn't get called properly. And you cannot forward fill the values since there's only one column.

A slight tweak of the above CSV would let us use the symbol recognition in Fetcher. This version would also let us forward fill the status variable (1 for added, 0 for removed)

Date,symbol,DJIA  
1896-05-26,GE,1 # GE  
1901-04-01,GE,0 # GE removed  
1901-04-01,IP,1 # IP  
1901-07-01,IP,0 # IP removed  
1901-04-01,X,1 # X  
1907-11-01,GE,1 # GE re-added  
1915-03-16,GM,1 # GM  
1916-10-04,X,0 # X removed  
etc  

You could add another column for the S&P 100 and put it all in the same CSV. Keep in mind you still require a list of sid() references in the code to bring the tickers into your 'universe' for the backtest. And there is a limit of 100 sid() references for each backtest.

Date,symbol,DJIA,SP100  
1896-05-26,GE,1,0  
1901-04-01,GE,0,0  
1901-04-01,IP,1,0  
1901-07-01,IP,0,0  
1901-04-01,X,1,0  
1907-11-01,GE,1,0  
1915-03-16,GM,1,0  
1916-10-04,X,0,0  
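
To show how the forward-filled status column would be read, here is a small sketch in plain Python using the standard `csv` module (not Fetcher). The function name `members_as_of` is hypothetical, and the sample rows are the illustrative ones from this post, not verified index history.

```python
import csv
import io


def members_as_of(csv_text, index_name, as_of):
    """Return the set of symbols flagged 1 for index_name on or before as_of.

    Dates are ISO strings (YYYY-MM-DD), so they compare correctly as text.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda r: r["Date"])   # stable sort: same-day rows keep file order
    status = {}
    for row in rows:
        if row["Date"] > as_of:
            break
        status[row["symbol"]] = int(row[index_name])  # latest flag wins (forward fill)
    return {sym for sym, flag in status.items() if flag == 1}


SAMPLE = """Date,symbol,DJIA,SP100
1896-05-26,GE,1,0
1901-04-01,GE,0,0
1901-04-01,IP,1,0
1901-07-01,IP,0,0
1901-04-01,X,1,0
1907-11-01,GE,1,0
1915-03-16,GM,1,0
1916-10-04,X,0,0
"""
```

With these rows, `members_as_of(SAMPLE, "DJIA", "1910-01-01")` gives GE and X, since IP was removed in 1901 and GM is not added until 1915.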

Grant, Vlatko - Not to throw a spanner in the works, but I don't see where we would be accounting for the weighting of each security within the ETF. If those change over time, it would entail having to re-jig the system every time (if changes are known to begin with)

Would it not be easier to just have the Python equivalent of Excel's 'Get external data' function? Point that at the URL for the ETF; the summary pages always have the NAV listed in the same place.

@Gautam, you would have to roll your own CSV with NAV values and use the Fetcher function to get them into the backtester. See this thread on scrapy.
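
As a rough idea of what such a hand-rolled file and lookup could look like, here is a sketch in plain Python. It does not use Fetcher itself; the column names `Date` and `NAV`, the function names, and the numbers in the usage example are all made up for illustration.

```python
import csv
import io
from datetime import date


def load_nav_series(csv_text):
    """Parse 'Date,NAV' rows into a {date: nav} dict."""
    navs = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        navs[date.fromisoformat(row["Date"])] = float(row["NAV"])
    return navs


def latest_nav_before(navs, day):
    """Return the most recent NAV published strictly before day, or None."""
    earlier = [d for d in navs if d < day]
    return navs[max(earlier)] if earlier else None
```

For example, with `navs = load_nav_series("Date,NAV\n2013-07-01,15.25\n2013-07-02,15.5\n")`, calling `latest_nav_before(navs, date(2013, 7, 3))` returns `15.5`. Looking only at dates strictly before the current day keeps the backtest from peeking at a NAV that was not yet published.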

Dennis - Thanks. Am looking into that now.