Stock prices reflect the trading decisions of many individuals. For the most part, quantitative finance has developed sophisticated methods that try to predict future trading decisions (and thus the price) based on past trading decisions. But what about the information-gathering phase that precedes a trading decision? Two recent papers in Nature’s Scientific Reports suggest that Google searches and Wikipedia usage patterns contain signal about this information-gathering phase that can be exploited in a trading algorithm. As is (unfortunately) very common, the papers come with no published code that we can use to easily replicate the results. The algorithms are very simple, though, so I coded both of them on Quantopian. They do seem to perform quite favourably and thus roughly replicate the results of the papers, as you can see below. The original simulations did not model transaction costs or slippage, which we include here. In that regard, we can show that these strategies still seem to work under more realistic settings.
This algorithm looks at the Google Trends data for the word ‘debt’. According to the paper, that word has the most predictive power.
Fetching this data is not easy to automate within Quantopian, but it’s relatively easy to do manually. I downloaded the CSV file and edited it to get it into the right format. I uploaded the resulting file here.
If you want to use my data on ‘debt’, feel free to do so. If you want to use the Google Trends data for a different word, you can download the CSV, edit it to look like mine, and place it in a public Dropbox folder or on some other web server.
If there is enough interest we can make this data more accessible (if you want to help me with this, an automated Python script that parses the CSV returned by Google Trends into the format I posted would be much appreciated).
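As a starting point for such a script, here is a minimal sketch. Both formats in it are assumptions on my part: the input is assumed to be a Google Trends export with a few header lines followed by rows like `2004-01-04 - 2004-01-10,63`, and the output is assumed to be a two-column CSV with a `Date` column and one column per search term. The function names `parse_trends_csv` and `to_csv` are made up for illustration, so adjust them and the column layout to match the uploaded file.

```python
import csv
import io
from datetime import datetime


def parse_trends_csv(text):
    """Return [(week_end_date, volume), ...] from an ASSUMED Trends export.

    Assumed data rows: "YYYY-MM-DD - YYYY-MM-DD,<integer>".
    Everything else (titles, column headers, blank lines) is skipped.
    """
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if len(record) != 2 or " - " not in record[0]:
            continue  # not a weekly data row
        try:
            _start, end = record[0].split(" - ")
            week_end = datetime.strptime(end.strip(), "%Y-%m-%d").date()
            volume = int(record[1])
        except ValueError:
            continue  # malformed date range or non-numeric volume
        rows.append((week_end, volume))
    return rows


def to_csv(rows, term="debt"):
    """Write rows as 'Date,<term>' CSV text (hypothetical target layout)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Date", term])
    for week_end, volume in rows:
        writer.writerow([week_end.isoformat(), volume])
    return out.getvalue()


# Made-up sample input in the assumed export layout.
sample = """Web Search interest: debt
United States; 2004 - present

Week,debt
2004-01-04 - 2004-01-10,63
2004-01-11 - 2004-01-17,60
"""

print(to_csv(parse_trends_csv(sample)))
```

Each week is stamped with its end date here so that a value is only available once the week is over; whether that matches the uploaded file is again an assumption.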
http://www.nature.com/srep/2013/130425/srep01684/images/srep01684-f3.jpg
For this algorithm, whenever the weekly average is smaller than its moving average over delta_t weeks (in this case delta_t == 5 weeks), we buy and hold the S&P500 for one week. If the weekly average is larger than the moving average, we sell and re-buy the S&P500 after one week. The original paper uses the Dow Jones Industrial Average; the S&P500 is highly correlated with it, however.
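The rule above can be sketched in a few lines of plain Python. This is an illustration, not the Quantopian algorithm itself: the function name `trend_signals` is made up, the moving average is assumed to cover the delta_t weeks before the current one (excluding the current week), and a tie is assumed to mean no position.

```python
def trend_signals(volumes, delta_t=5):
    """Return one signal per week, starting at week index delta_t.

    +1: this week's search volume is below the moving average of the
        previous delta_t weeks, so buy and hold for one week.
    -1: it is above the moving average, so sell and re-buy after one week.
     0: exactly equal (assumption: take no position).
    """
    signals = []
    for t in range(delta_t, len(volumes)):
        # Average of the delta_t weeks before week t (week t itself excluded).
        moving_avg = sum(volumes[t - delta_t:t]) / delta_t
        if volumes[t] < moving_avg:
            signals.append(1)
        elif volumes[t] > moving_avg:
            signals.append(-1)
        else:
            signals.append(0)
    return signals


# Made-up weekly search volumes, purely for illustration.
weekly_volumes = [10, 10, 10, 10, 10, 8, 12, 10]
print(trend_signals(weekly_volumes, delta_t=5))  # [1, -1, 0]
print(trend_signals(weekly_volumes, delta_t=3))  # [0, 0, 1, -1, 0]
```

Keeping delta_t as a parameter makes it easy to compare different window lengths on the same data.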
Suggestions for improvement (please post improvements as replies to this thread):
- The authors used many different search queries, listed here. If you upload different queries in the same CSV format as I did, we can explore those as well.
- The authors of the paper used delta_t == 3, whereas the algorithm here uses 5. It would be interesting to see how the algorithm performs when this is changed.
- The underlying algorithm is a very basic moving average cross-over. Surely a more clever strategy could do a much better job.