Adventures outside Quantopian

So I got interested in machine learning, and unfortunately Quantopian (quite understandably) doesn't support neural networks, among other things, even though they are generally regarded as quite powerful. That kicked off a few weeks of adventure outside Quantopian's community and platform, and I thought I'd share some of my findings here.

First of all, Quantopian's hoard of data is fantastic. If you haven't tried to get minute-level data outside Quantopian, know that it's extremely difficult to do so without throwing ~$900 at a place like QuantQuote (the most highly recommended provider I was able to find). The pipeline API is also super handy. Integrating other data sources is a pain when you're doing it yourself.

Next up, if you get into neural networks or machine learning, there are four problems you might attempt to solve (they were the ones I attempted to solve anyway):

  • Predicting the next day's stock price
  • Predicting the stock price for the next few days/months
  • Predicting the change in price for the next few days/months
  • Simply training a network to make good trades, skipping out on the middleman decision making

I encountered many academic papers that targeted the first one. Their results were relatively easy to reproduce, and when you plot a curve based on the output of one of these networks, you get something that looks wonderfully accurate. However, I came to realize that even if you simply use today's price as the prediction for tomorrow, you get a curve that looks close, because prices rarely change by a significant amount from one day to the next.
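To make that point concrete, here is a minimal sketch (not taken from any of the papers) that scores a forecast against the "tomorrow equals today" baseline. The price series is a synthetic random walk with roughly 1% daily moves, and the function names are made up for illustration.

```python
import numpy as np

def persistence_forecast(prices):
    """Forecast for day t is simply the observed price on day t-1."""
    prices = np.asarray(prices, dtype=float)
    return prices[:-1]

def mean_abs_error(actual, forecast):
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs(actual - forecast)))

def skill_vs_persistence(prices, model_forecast):
    """1 - MAE(model) / MAE(persistence).

    model_forecast[i] is the model's prediction for prices[i + 1].
    A value at or below zero means the model is no better than the naive baseline.
    """
    prices = np.asarray(prices, dtype=float)
    actual = prices[1:]
    naive_mae = mean_abs_error(actual, persistence_forecast(prices))
    return 1.0 - mean_abs_error(actual, model_forecast) / naive_mae

# Synthetic data (an assumption for illustration, not real market prices).
rng = np.random.default_rng(0)
daily_returns = rng.normal(0.0, 0.01, size=500)
prices = 100.0 * np.cumprod(1.0 + daily_returns)

# Average percentage error of the naive "no change" forecast.
naive_mape = float(np.mean(np.abs(prices[1:] - prices[:-1]) / prices[1:]))
print(f"naive forecast is off by {naive_mape:.2%} on average")
```

A network whose skill score is near zero has learned little beyond copying the previous price, however good its plot looks.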

The real challenge is to predict the middle two, which, as I discovered, is rather more difficult. Eventually I gave up on this. I don't believe that predicting change is doable without:

  • A ton of premium data (I had to settle for daily data from Quandl). The kind of historical data used in the academic papers (typically going back to the 1960s) is very difficult and expensive to get, and is only really available to institutional investors. More recent minute data (usually available back to ~1998) is obtainable but also very expensive.
  • External data that has decent predictive value. The Google Trends stuff going around is a good example of this. There are numerous academic papers using neural networks and machine learning that report fantastic profits from extracting sentiment from news feeds and the like. I had a look around and (again) data was the problem. It's near impossible to get a large enough archive of news headlines to train a network on, and most of the existing sentiment providers like Accern or FinSentS don't provide nearly enough data to be useful for comprehensive backtesting.

So eventually I moved on to the last task, training an algorithm to make trades directly. The best paper I came across was this, which is a neural network enhanced momentum strategy. It performed fantastically well in the paper but unfortunately I wasn't able to reproduce it as it used an obscure type of neural network layer that I didn't have time to implement from scratch.

After that, I moved away from neural networks for a bit. I came across a few things that don't seem to have popped into the Quantopian hivemind:

  • Candlestick patterns
  • OneOption
  • Zorro
  • Forex (not supported by Quantopian so seems natural)
  • Modern portfolio theory and efficient frontier

On candlestick patterns, I found a summary of a few indicators that seemed promising. The articles were very optimistic about the effectiveness of these patterns, so I grabbed a trial of QuantShare, which has a good number of them built in, and backtested them using EOD data. Unfortunately, the results, while not entirely useless, weren't fantastic. Most candlestick indicators (on their own) weren't able to make successful trades most of the time, and while some were profitable (38% annual return was the highest), they also came with some harsh drawdown (38%). I have a feeling though that these might be combined with other information to produce a more profitable algo.
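For anyone comparing numbers, drawdown here means the largest peak-to-trough drop in account equity. A minimal sketch of how it is commonly computed, using a made-up equity curve (QuantShare's own calculation may differ):

```python
import numpy as np

def max_drawdown(equity):
    """Largest fractional drop from a running peak, e.g. 0.25 for a 25% drawdown."""
    equity = np.asarray(equity, dtype=float)
    running_peak = np.maximum.accumulate(equity)
    drawdowns = 1.0 - equity / running_peak
    return float(drawdowns.max())

# Hypothetical account values over five periods.
equity_curve = [100.0, 120.0, 90.0, 130.0, 104.0]
worst = max_drawdown(equity_curve)  # the fall from 120 to 90 is the deepest
print(f"max drawdown: {worst:.0%}")
```

A strategy that earns 38% a year but can also lose 38% from a peak needs a lot of nerve (and capital) to keep running.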

Next up is OneOption. I found this through my broker's marketplace. Basically, they have a "Stock Trader" product that managed to produce 50% returns last quarter. The strategy seems to be a rather simple case of pattern matching (price compression, breakout + volume) and looks like a prime fit for Quantopian. I've been messing with other things recently so I haven't had a chance to implement and backtest it myself but I thought I'd mention it in case anyone else wants a look.

Another cool algo trading community I found is the Zorro community. Although they have a bit more of a focus on Forex and commodities, they have some stuff applicable to us stock folks as well. For example there's an interesting approach to strategy validation written for it that might be handy for Quantopian algos.

And penultimately, I thought I'd bring up Forex. I'm unsure why - perhaps it's the crazy margin they have - but there seem to be a ton of algorithms with insane performance and people able to validate them.

Finally, modern portfolio theory and the efficient frontier. There have been some discussions around this in the past but nothing has come up recently. Basically, the theory says that given a bunch of assets, there is an optimal portfolio (the one with the highest expected return) for any particular target volatility. You can see an example of what the frontier looks like here. I've attached a notebook showing a pyfolio tearsheet for an algorithm matching that portfolio as well. It performs pretty badly but might be a good basis for something else.
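As a rough illustration of the idea (this is not the code from the attached notebook), the frontier can be traced with the textbook closed-form solution. The sketch below fixes a target return and finds the lowest-volatility weights, which describes the same curve as fixing a target volatility and maximizing return. It assumes short positions are allowed and the portfolio is fully invested; the expected returns, volatilities and the 0.3 correlation are made-up inputs.

```python
import numpy as np

def frontier_weights(mu, cov, target_return):
    """Minimum-variance weights that sum to 1 and earn target_return in expectation."""
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    ones = np.ones_like(mu)
    inv_ones = np.linalg.solve(cov, ones)  # cov^-1 @ 1
    inv_mu = np.linalg.solve(cov, mu)      # cov^-1 @ mu
    a = ones @ inv_ones
    b = ones @ inv_mu
    c = mu @ inv_mu
    d = a * c - b * b
    lam = (c - b * target_return) / d
    gam = (a * target_return - b) / d
    return lam * inv_ones + gam * inv_mu

def portfolio_volatility(weights, cov):
    return float(np.sqrt(weights @ cov @ weights))

# Made-up annualized inputs for three assets.
mu = np.array([0.05, 0.08, 0.12])
vols = np.array([0.10, 0.15, 0.25])
corr = np.full((3, 3), 0.3)
np.fill_diagonal(corr, 1.0)
cov = np.outer(vols, vols) * corr

# Compare against an equal-weight portfolio with the same expected return.
equal = np.ones(3) / 3
target = float(equal @ mu)
weights = frontier_weights(mu, cov, target)
print(weights, portfolio_volatility(weights, cov), portfolio_volatility(equal, cov))
```

The frontier portfolio earns the same expected return as the equal-weight one with no more volatility. In practice the hard part is estimating `mu` and `cov`, which may be part of why such portfolios disappoint out of sample.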

Anyhow, hopefully this information dump is useful or interesting to someone.

4 responses

Interesting research and overview, thank you for sharing!

I don't have a lot of experience reading research papers on trading, but the ones I have seen (including the one about momentum trading here) do not seem to mention the risk management part of the strategy. Predicting price movements is one thing, but what about the logic for entry/exit, order size and stops? If the model is right in 90% of cases, how much money will the remaining 10% lose? Have you seen any papers on this subject in the 4th category of your list?
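The question about the losing 10% comes down to simple expectancy arithmetic. A minimal sketch with made-up numbers (the 90% win rate comes from the question above; the win and loss sizes are assumptions, not figures from any paper):

```python
def expectancy(win_rate, avg_win, avg_loss):
    """Expected profit per trade, in the same units as avg_win and avg_loss."""
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss

def breakeven_loss(win_rate, avg_win):
    """Average loss size at which the strategy makes exactly nothing."""
    return win_rate * avg_win / (1.0 - win_rate)

# Hypothetical model: right 90% of the time, winning 1 unit per correct call.
small_losses = expectancy(0.9, avg_win=1.0, avg_loss=2.0)   # losers cost 2 units
large_losses = expectancy(0.9, avg_win=1.0, avg_loss=10.0)  # losers cost 10 units
limit = breakeven_loss(0.9, avg_win=1.0)

print(small_losses, large_losses, limit)
```

With these numbers the model only stays profitable while the average loser costs less than about 9 units, which is why accuracy alone says little without stops and position sizing.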

I am assuming that a trading algorithm should utilize the confidence of each predicted data point to assign appropriate risk for each trade, but that mapping on its own is a separate problem.

Also, I am not sure I understand the argument that predicting long-term movements is better (more valuable) than predicting short-term movements. If you can predict the highs and lows for each day, then by entering and exiting at those points each day you will get the maximum possible return each day, which will be the maximum possible return cumulatively as well. In contrast (and by definition), the maximum possible return from, say, a single trade between the weekly high and low will be smaller.

I haven't seen much about that I'm afraid. The few I saw that did attempt to generate profit employed pretty basic strategies and usually only tried to prove that the network generated profit at all. I think most of them were concentrating on the neural network study more than the financial aspect. If I could reproduce their results I might do so and improve on them but reproduction is time consuming since they generally use obscure network layers that aren't preimplemented in most of my libraries (I generally use Keras).

And yes, the paper I studied the most ranked its network-generated signals and traded the top 10, which the authors found to be most profitable. They also included a plot of the profit generated at each network rating, and you could see a curve.
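The ranking step is simple to sketch. This is only an illustration of the idea, not the paper's code: the symbols and scores are made up, and equal weighting of the selected names is an assumption.

```python
def top_k_by_score(scores, k=10):
    """Return the k symbols with the highest network score, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [symbol for symbol, _ in ranked[:k]]

def equal_weights(symbols):
    """Split the portfolio evenly across the selected symbols."""
    if not symbols:
        return {}
    weight = 1.0 / len(symbols)
    return {symbol: weight for symbol in symbols}

# Hypothetical network outputs for four tickers.
scores = {"AAA": 0.91, "BBB": 0.12, "CCC": 0.55, "DDD": 0.78}
picks = top_k_by_score(scores, k=2)
targets = equal_weights(picks)
print(picks, targets)
```

Trading only the highest-ranked signals is one crude way of using the network's confidence: low-scoring predictions simply get no capital.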

As for short term being better than long term, I agree. Unfortunately, it's among the more difficult tasks, with the availability of data making it especially challenging. I debated buying the historical minute data but this is more of a hobby for me and I couldn't justify the expense. And unfortunately, Quantopian and neural networks are essentially exclusive. Maybe one day they'll allow us to train Keras models using AWS machines or something but until then we're out of luck.

I'm not sure what you mean by "Quantopian doesn't support neural networks". There might not be a Quantopian library for them but they're relatively simple structures to implement yourself if you have taken any sort of machine learning coursework.

It's technically possible to run neural network code on Quantopian, yes. However, the kinds of networks I find studied are generally more resource intensive than Quantopian allows. It's also possible to train the network outside Quantopian and bring the weights over, but then you don't have access to minute data to train on.
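For what it's worth, the "bring the weights over" route only needs a forward pass, which is a few lines of NumPy. A minimal sketch, assuming a plain fully connected network whose weights were exported elsewhere (for example with Keras's `model.get_weights()`) and pasted in as arrays. The tiny weights below are made up so the result can be checked by hand, and the `(kernel, bias, activation)` layout is just a convention chosen for this example.

```python
import numpy as np

def dense_forward(x, layers):
    """Run inputs through a stack of dense layers.

    layers is a list of (kernel, bias, activation) tuples, where kernel has
    shape (n_inputs, n_outputs) and activation is "relu", "sigmoid" or "linear".
    """
    out = np.asarray(x, dtype=float)
    for kernel, bias, activation in layers:
        out = out @ kernel + bias
        if activation == "relu":
            out = np.maximum(out, 0.0)
        elif activation == "sigmoid":
            out = 1.0 / (1.0 + np.exp(-out))
        elif activation != "linear":
            raise ValueError(f"unsupported activation: {activation}")
    return out

# Made-up weights for a 2-input, 2-hidden-unit, 1-output network.
layers = [
    (np.array([[1.0, -1.0], [0.5, 0.5]]), np.zeros(2), "relu"),
    (np.array([[1.0], [1.0]]), np.zeros(1), "linear"),
]

features = np.array([2.0, 2.0])
prediction = dense_forward(features, layers)
print(prediction)  # hidden layer gives [3, 0] after ReLU, so the output is [3.]
```

This only covers inference. The papers' more exotic layers would each need their own forward implementation, and the training data problem remains.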