Google Search Terms predict market movements

Stock prices reflect the trading decisions of many individuals. For the most part, quantitative finance has developed sophisticated methods that try to predict future trading decisions (and the price) based on past trading decisions. But what about the information-gathering phase that precedes a trading decision? Two recent papers in Nature's Scientific Reports suggest that Google searches and Wikipedia usage patterns contain signal about this information-gathering phase that can be exploited in a trading algorithm. As is (unfortunately) very common, the papers come with no published code that we could use to easily replicate the results. The algorithms are very simple, though, so I coded both of them on Quantopian. They indeed seem to perform quite favourably and thus roughly replicate the results of the paper, as you can see below. The original simulations did not model transaction costs or slippage, which we include here. In that regard, we can show that these strategies still seem to work under more realistic settings.

This algorithm looks at the Google Trends data for the word ‘debt’. According to the paper, that word has the most predictive power.

Fetching this data is not easy to automate within Quantopian, but it's relatively easy to handle manually. I downloaded the csv file and edited it to get it into the right format. I uploaded the resulting file here.

If you want to use my data on 'debt' feel free to do so. If you want to use the Google Trends data for a different word, you can download the CSV, edit it to look like mine, and place it in a public Dropbox or on some other webserver.

If there is enough interest we can make this data more accessible (if you want to help me with this, an automated Python script that parses the csv returned by Google Trends into the format I posted would be much appreciated).

http://www.nature.com/srep/2013/130425/srep01684/images/srep01684-f3.jpg

For this algorithm, when the weekly search value is below its moving average over the previous delta_t weeks (in this case delta_t == 5), we buy and hold the S&P 500 for one week. If the weekly value is above the moving average, we sell and re-buy the S&P 500 after one week. The original paper uses the Dow Jones Industrial Average; the S&P 500 is highly correlated with it, however.
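In rough pandas terms (a sketch only, with hypothetical file and column names, not the actual Quantopian implementation), the rule looks something like this:

import pandas as pd

# weekly Google Trends values for 'debt' and weekly S&P 500 closes,
# assumed to share the same week-ending dates (file/column names are made up)
trends = pd.read_csv("debt_trends.csv", parse_dates=["Date"], index_col="Date")["debt"]
spy = pd.read_csv("spy_weekly.csv", parse_dates=["Date"], index_col="Date")["close"]

delta_t = 5
trailing_mean = trends.rolling(delta_t).mean().shift(1)   # average of the previous delta_t weeks
signal = trends < trailing_mean                           # below average -> be long next week
position = signal.shift(1).fillna(False).astype(bool)     # act on the signal the following week

strategy_returns = spy.pct_change().where(position, 0.0)  # flat (in cash) otherwise
print((1 + strategy_returns).cumprod().tail())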

Suggestions for improvement (please post improvements as replies to this thread):

  • The authors used many different search queries, listed here. If you upload different queries in the same csv format as I did, we can explore those as well.
  • delta_t == 3 is what the authors of the paper used. It would be interesting to see how the algorithm performs when this is changed.
  • The underlying algorithm is a very basic moving average cross-over. Certainly a more clever strategy might be able to do a much better job.
Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

111 responses

That's a very interesting result Thomas.

I have attached a tweaked version of your backtest. This one starts with $100k and risks 75% of available margin every week. E.g. $150k the first week and then more or less depending on profit & loss.

If we look at the results yearly it becomes clear that this system performs fairly average except for being very strong in bear markets.

2004: algo 23% SPY 14%
2005: algo 10% SPY 7%
2006: algo 3% SPY 8%
2007: algo -15% SPY 5%
2008: algo 13% SPY -40%
2009: algo 137% SPY 26%
2010: algo 28% SPY 15%
2011: algo 41% SPY 6%
2012: algo 9% SPY 14%

Edit: more years, and starting in December the previous year (to calculate 5 week indicator)

I just looked at your CSV file. I think you have a slight time travel problem.

The date stamps in your CSV file are for the week ahead. Meaning you are getting the Google Trends search results prior to them being collected by Google.

You should instead use the date at the end of the week.

I found a Python script for accessing Google Trends.

https://github.com/suryasev/unofficial-google-trends-api

I wonder if we could plug this directly into our algos.

Hi Dennis,

Thanks for posting the margin version of the algorithm. It's also an interesting observation that the algorithm seems to capitalize on bear markets. You're also correct about the time-shift. I'll upload a new csv file with the fixed dates.

The pyGTrends.py is certainly useful. However, I think it only fetches the csv but doesn't reformat it into the shape we'll require.

Here is a SED script that cleans up the Google Trends CSV.

# skip to Week report  
1,/^Week,/ {

  # process header for Week report  
  /^Week,/ {

    # rename Week column to Date  
    s/^Week,/Date,/

    # print header row  
    P  
  }  
  # delete skipped lines  
  D  
}

# remove start date (leaving end date in first column)  
s/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] - //

# remove row that does not have numeric value in second column  
s/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9],[^0-9]*$//

# remove all lines after Week report (match first blank line)  
/^$/,$ d
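In case it helps, the script above can be run with standard sed, for example sed -f cleanup.sed downloaded_report.csv > debt.csv, where the filenames are just placeholders.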

Thanks Thomas,

I haven't dug into the papers, but might there be a bias here? For example, your backtest is using Google Trends data on the string "debt" over the 2008/2009 debt crisis, an extreme event. Is there any indication that this would be a viable strategy going forward from today, with a different search string? How would we have any idea what string would be predictive?

I've wondered if there were more conventional "big red flags" that all was not well in the financial world. One reference is http://www.efficientfrontier.com/ef/0adhoc/darkside.htm where William Bernstein comments (on a TIPS yield curve dated 3/8/2008):

"Conclusion: The debt markets are so out of whack that we are now at a point where credit risk is being rewarded more than equity risk, something that should never happen in a world where equity investors own only the residual rights to earnings. This cannot last for very long: either spreads will tighten rapidly, equity prices will fall rapidly, or both. (Or, chortle, earnings will grow more rapidly.) Stay tuned."

Grant

Thomas,

Here's an interesting reference:

http://www.newyorkfed.org/research/capital_markets/Prob_Rec.pdf

Note the increasing predicted probability of a recession, starting in 2005-2006.

If you are interested in macro-trends, you might also have a look at the charts and data available here:

http://www.aheadofthecurve-thebook.com/

I read the book awhile back, and my take-away was that when consumers have money in their pockets, they spend it, driving corporate earnings and stock prices. My read is that presently, the U.S. government is putting money into pockets by financing artificially low interest rates (but I ain't no economist, so I could be way off).

Grant

Here's a python script that looks for Google Trends CSV files in an 'input' folder and writes modified CSV files to an 'output' folder. Both folders should already exist.

import glob

# process all CSV files in the 'input' folder, write modified CSV to the 'output' folder
for filename in glob.glob("input/*.csv"):
    # flag to skip input lines until the Week report starts
    skip = 1
    # open input file
    with open(filename) as f:
        # open output file for writing
        with open(filename.replace("input", "output"), "w") as o:
            # get all lines
            lines = f.readlines()
            # process each line
            for line in lines:
                # check for start of Week report
                if skip and line.startswith("Week,"):
                    # stop skipping lines
                    skip = 0
                    # output modified CSV header
                    o.write(line.replace("Week,", "date,"))
                elif not skip:
                    # a blank line marks the end of the Week report
                    if line.strip() == '':
                        # skip remainder of input file
                        skip = 1
                        break
                    else:
                        # remove start date (leaving end date in first column)
                        data = line.split(" ")  # split on space
                        if len(data) == 3:
                            fields = data[2].split(",")  # split on comma
                            # check that a numeric value follows the comma
                            if len(fields) == 2 and fields[1].strip() != '':
                                # numeric value is present, write row to output
                                o.write(data[2])
                            else:
                                # numeric value missing after comma, report done
                                skip = 1
                                break
                        else:
                            # input line doesn't conform
                            skip = 1
                            break

So I just re-ran the 'debt' keyword backtest using an up-to-date CSV file (using the end-of-week dates to avoid time travel).

Sadly it doesn't have the same punch as before.

Dennis: Thanks for doing that. The scripts will also be very helpful in automating this.

I also ran it with the correct timing and got similar (disappointing) results. What is curious is that the above (wrong) result seems to be pretty similar to the results of the paper. Not sure what we did wrong, the paper states:
"We use Google Trends to determine how many searches n(t – 1) have been carried out for a specific search term such as debt in week t – 1, where Google defines weeks as ending on a Sunday, relative to the total number of searches carried out on Google during that time." So it's pretty clear that my above code was not what the authors described.

I had almost the exact same thought process. It's always disappointing to debunk a promising strategy. Hopefully we'll figure out a way to use the effort anyway.

I created a script to automate the downloading and cleaning up of the Google Trends CSV data here: https://gist.github.com/tlmaloney/5650699. Let me know if you run into any issues. I seem to have hit my Google Trends quota limit for number of queries from my IP address.

Thomas, code looks great -- thanks!

Do you know what the quota limit is? Certainly that'd make it very hard for us to integrate this.

The paper by Preis, Moat, and Stanley has numerous flaws in it, and it seems unlikely that it can be replicated by anyone. Chief among them is the datamining bias of selecting the best of 100 search terms based on in-sample performance and expecting that same performance to be an unbiased estimate of future performance. (The authors evidently also do not understand how shorting works, or statistical hypothesis testing.) Even if you can reconcile the biases in their backtests, and somehow get the same Google Trends data that they do, it would be difficult to see through this selection bias.

Has anyone here succeeded in replicating Figure 2? I think that they got 326% by summing returns every week. I tried many ways but failed to get the same result. Only with geometric returns and delta = 4 or 5 can I get a 3.23 return. But in that case, the profit should start at 100, not 0.

I found another interesting discussion of the paper at http://sellthenews.tumblr.com/

@Sangno: Are you trying a 1-to-1 replication? I'm also working on an IPython Notebook to do just that but I'm not confident enough yet. Maybe I should post it so that we can join efforts?

I am so happy that you are also trying to replicate the paper. I used the SAS package to test their paper, but to understand each step more clearly I used Excel too. Here is my approach.

  1. Sample data
    We need two sample datasets: DJIT and the Google Trends series for the 'debt' term. I collected the DJIT data from http://www.djaverages.com/?go=industrial-index-data&report=performance with index price history from Jan 5, 2004 to Feb 22, 2011. I drew a graph of DJIT and got the same result as Figure 1. I don't think the DJIT data has any problems.

I collected the trend of 'debt' from Google Trends, http://www.google.com/trends/explore?q=debt#q=debt&geo=US&date=1%2F2004%2086m&cmpt=q. I restricted my search to the 'debt' term, U.S. geography, and the period from Jan 2004 to Feb 2011. I downloaded this result with 'Download as CSV' on the same site. Here are the first six rows.

Week debt
2004-01-04 - 2004-01-10 63
2004-01-11 - 2004-01-17 66
2004-01-18 - 2004-01-24 63
2004-01-25 - 2004-01-31 66
2004-02-01 - 2004-02-07 61
2004-02-08 - 2004-02-14 62

  2. Matching DJIT data and debt data
    I merged the debt data with DJIT by matching each week's debt value to the nearest DJIT date after that week. For instance, the end date of the first week is Jan 10, 2004, so that week's debt value is matched with the first occurrence of DJIT after Jan 10. Because of holidays and other reasons, some weeks don't have a transaction on Monday, so we need to find the first trading day after the end of each week of the term data. Here are the first several lines.

Week debt sdate edate ddate djit
2004-01-04 - 2004-01-10 63 1/4/2004 1/10/2004 1/12/2004 10485.17702
2004-01-11 - 2004-01-17 66 1/11/2004 1/17/2004 1/20/2004 10528.6635
2004-01-18 - 2004-01-24 63 1/18/2004 1/24/2004 1/26/2004 10702.51163
2004-01-25 - 2004-01-31 66 1/25/2004 1/31/2004 2/2/2004 10499.18265
2004-02-01 - 2004-02-07 61 2/1/2004 2/7/2004 2/9/2004 10579.03279
2004-02-08 - 2004-02-14 62 2/8/2004 2/14/2004 2/17/2004 10714.88173
2004-02-15 - 2004-02-21 61 2/15/2004 2/21/2004 2/23/2004 10609.62473
2004-02-22 - 2004-02-28 62 2/22/2004 2/28/2004 3/1/2004 10678.14178

  3. Moving average
    When Google Trends reports trends data, the numbers do not seem to be finalized exactly on Sunday. Suppose that you collect trend data on Jan 11, 2004 for the week 2004-01-04 to 2004-01-10. On Sunday (Jan 11), you may see "Partial Data" in Google Trends because the trend data for the previous week has not been finalized yet. Note: Google uses a week from Sunday to Saturday. On page 2 of the paper, the authors state "where Google defines weeks as ending on a Sunday..". I think this is not correct.

Because of the incomplete data on Sunday, the authors consider it necessary to use a moving average over the last 3 weekly values. Thus, we make an average of three weeks of data. The first row is just the first value, and the second row is the average of the first and second values. Here are the first few rows; debt3 indicates the 3-week average.

Week debt sdate edate ddate djit debt3
2004-01-04 - 2004-01-10 63 1/4/2004 1/10/2004 1/12/2004 10485.17702 63
2004-01-11 - 2004-01-17 66 1/11/2004 1/17/2004 1/20/2004 10528.6635 64.5
2004-01-18 - 2004-01-24 63 1/18/2004 1/24/2004 1/26/2004 10702.51163 64
2004-01-25 - 2004-01-31 66 1/25/2004 1/31/2004 2/2/2004 10499.18265 65
2004-02-01 - 2004-02-07 61 2/1/2004 2/7/2004 2/9/2004 10579.03279 63.33333333
2004-02-08 - 2004-02-14 62 2/8/2004 2/14/2004 2/17/2004 10714.88173 63
2004-02-15 - 2004-02-21 61 2/15/2004 2/21/2004 2/23/2004 10609.62473 61.33333333
2004-02-22 - 2004-02-28 62 2/22/2004 2/28/2004 3/1/2004 10678.14178 61.66666667

  4. delta (page 2)
    delta(t) = n(t) - debt3(t-1)

Here are the first few rows.

Week debt sdate edate ddate djit debt3 delta
2004-01-04 - 2004-01-10 63 1/4/2004 1/10/2004 1/12/2004 10485.17702 63 0
2004-01-11 - 2004-01-17 66 1/11/2004 1/17/2004 1/20/2004 10528.6635 64.5 3
2004-01-18 - 2004-01-24 63 1/18/2004 1/24/2004 1/26/2004 10702.51163 64 -1.5
2004-01-25 - 2004-01-31 66 1/25/2004 1/31/2004 2/2/2004 10499.18265 65 2
2004-02-01 - 2004-02-07 61 2/1/2004 2/7/2004 2/9/2004 10579.03279 63.33333333 -4
2004-02-08 - 2004-02-14 62 2/8/2004 2/14/2004 2/17/2004 10714.88173 63 -1.333333333
2004-02-15 - 2004-02-21 61 2/15/2004 2/21/2004 2/23/2004 10609.62473 61.33333333 -2
2004-02-22 - 2004-02-28 62 2/22/2004 2/28/2004 3/1/2004 10678.14178 61.66666667 0.666666667
2004-02-29 - 2004-03-06 61 2/29/2004 3/6/2004 3/8/2004 10529.4783 61.33333333 -0.666666667
2004-03-07 - 2004-03-13 61 3/7/2004 3/13/2004 3/15/2004 10102.89483 61.33333333 -0.333333333

  5. Trading and return
    If delta(t-1) > 0, take a short position.
    If delta(t-1) < 0, take a long position.
    If delta(t-1) = 0, take no action.
    I think the authors take no action on ties. When they explain transaction fees, they point out that the maximum number of transactions is only 104. If they took a short or long position on ties, they would have described the number of transactions as exactly 104, rather than as a maximum.

For the short position, the return is log(p_t-1) - log(p_t), and for the long position, the return is log(p_t) - log(p_t-1).

Here are the first few rows.

Week debt sdate edate ddate djit debt3 delta ret
2004-01-04 - 2004-01-10 63 1/4/2004 1/10/2004 1/12/2004 10485.17702 63 0 0
2004-01-11 - 2004-01-17 66 1/11/2004 1/17/2004 1/20/2004 10528.6635 64.5 3 0
2004-01-18 - 2004-01-24 63 1/18/2004 1/24/2004 1/26/2004 10702.51163 64 -1.5 -0.016377051
2004-01-25 - 2004-01-31 66 1/25/2004 1/31/2004 2/2/2004 10499.18265 65 2 -0.019181035
2004-02-01 - 2004-02-07 61 2/1/2004 2/7/2004 2/9/2004 10579.03279 63.33333333 -4 -0.007576593
2004-02-08 - 2004-02-14 62 2/8/2004 2/14/2004 2/17/2004 10714.88173 63 -1.333333333 0.012759588
2004-02-15 - 2004-02-21 61 2/15/2004 2/21/2004 2/23/2004 10609.62473 61.33333333 -2 -0.009872009
2004-02-22 - 2004-02-28 62 2/22/2004 2/28/2004 3/1/2004 10678.14178 61.66666667 0.666666667 0.006437245
2004-02-29 - 2004-03-06 61 2/29/2004 3/6/2004 3/8/2004 10529.4783 61.33333333 -0.666666667 0.014020047
2004-03-07 - 2004-03-13 61 3/7/2004 3/13/2004 3/15/2004 10102.89483 61.33333333 -0.333333333 -0.04135678

  6. Accumulating return
    We have two ways of accumulating returns: summing them from beginning to end, and compounding them geometrically, (1+r)(1+r)... (a pandas sketch of steps 3-6 follows after step 7).
    In my case, I get a geometric return of 2.027 and a sum of the returns of 0.824.

Here are the first and last few rows.
Week debt sdate edate ddate djit debt3 delta ret sret cret
2004-01-04 - 2004-01-10 63 1/4/2004 1/10/2004 1/12/2004 10485.17702 63 0 0 0 1
2004-01-11 - 2004-01-17 66 1/11/2004 1/17/2004 1/20/2004 10528.6635 64.5 3 0 0 1
2004-01-18 - 2004-01-24 63 1/18/2004 1/24/2004 1/26/2004 10702.51163 64 -1.5 -0.016377051 -0.016377051 0.983622949
2004-01-25 - 2004-01-31 66 1/25/2004 1/31/2004 2/2/2004 10499.18265 65 2 -0.019181035 -0.035558085 0.964756044
2004-02-01 - 2004-02-07 61 2/1/2004 2/7/2004 2/9/2004 10579.03279 63.33333333 -4 -0.007576593 -0.043134678 0.95744648
2004-02-08 - 2004-02-14 62 2/8/2004 2/14/2004 2/17/2004 10714.88173 63 -1.333333333 0.012759588 -0.03037509 0.969663103
2004-02-15 - 2004-02-21 61 2/15/2004 2/21/2004 2/23/2004 10609.62473 61.33333333 -2 -0.009872009 -0.040247099 0.96009058
2004-02-22 - 2004-02-28 62 2/22/2004 2/28/2004 3/1/2004 10678.14178 61.66666667 0.666666667 0.006437245 -0.033809854 0.966270918
2004-02-29 - 2004-03-06 61 2/29/2004 3/6/2004 3/8/2004 10529.4783 61.33333333 -0.666666667 0.014020047 -0.019789806 0.979818082
.... 2011-01-16 - 2011-01-22 63 1/16/2011 1/22/2011 1/24/2011 11980.51975 62.66666667 7.666666667 -0.011972994 0.835101399 2.050606652
2011-01-23 - 2011-01-29 67 1/23/2011 1/29/2011 1/31/2011 11891.93241 63 4.333333333 0.007421755 0.842523154 2.065825752
2011-01-30 - 2011-02-05 61 1/30/2011 2/5/2011 2/7/2011 12161.62995 63.66666667 -2 -0.022425689 0.820097466 2.019498187
2011-02-06 - 2011-02-12 61 2/6/2011 2/12/2011 2/14/2011 12268.19208 63 -2.666666667 0.008723994 0.828821459 2.037116276
2011-02-13 - 2011-02-19 67 2/13/2011 2/19/2011 2/22/2011 12212.79189 63 4 -0.004525986 0.824295473 2.027896317

  7. Drawing a graph
    I couldn't get 326%, although the line is similar to Figure 2.
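For anyone following along in Python rather than Excel or SAS, here is a rough pandas sketch of steps 3 to 6 above (my own code, not the authors'; df is assumed to already hold the merged weekly data with the 'debt' and 'djit' columns shown in the tables):

import numpy as np

# 3-week moving average of the search data (step 3)
df['debt3'] = df['debt'].rolling(3, min_periods=1).mean()

# delta(t) = n(t) - debt3(t-1) (step 4)
df['delta'] = df['debt'] - df['debt3'].shift(1)

# short if delta(t-1) > 0, long if delta(t-1) < 0, no action on ties (step 5)
df['position'] = -np.sign(df['delta'].shift(1)).fillna(0)

# weekly log return of the position (step 5)
log_djit = np.log(df['djit'])
df['ret'] = (df['position'] * (log_djit - log_djit.shift(1))).fillna(0)

# summed and geometric cumulative returns (step 6)
df['sret'] = df['ret'].cumsum()
df['cret'] = (1 + df['ret']).cumprod()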

@Thomas -- Sorry, I don't know what the quota limit is. But I may have been mistaken about hitting the quota limit. I was relying on an error message from https://github.com/pedrofaustino/google-trends-csv-downloader which I was using to retrieve the Google Trends CSV, and I realized this may not be the underlying cause. I ended up refactoring my script to use the Python package called mechanize, so it no longer uses pedrofaustino's code, and I am no longer running into a quota limit. Update at https://gist.github.com/tlmaloney/5650699.

Based on https://support.google.com/trends/answer/87282?hl=en&ref_topic=13975, you have to be very careful about how you use Google Trends data as a signal. Since the data is scaled based on the peak level in the time series, the data is not a time series representing point-in-time observations. For instance, let's say we've observed raw (unscaled) data:

time,raw_level
1,0.3
2, 0.6
3, 0.4

would then be scaled to:

time,scaled_level
1,50
2,100
3,67

but if the next point is

time,raw_level
4,0.9

the time series gets rescaled:

time,scaled_level
1,33
2,67
3,44
4,100

You can derive a signal off of the percent changes, but not the levels themselves.
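A tiny illustration of that point (my own toy numbers, ignoring the fact that Trends also rounds to integers, which adds further small discrepancies):

import pandas as pd

raw = pd.Series([0.3, 0.6, 0.4, 0.9])  # hypothetical unscaled search volumes

# each download is rescaled so that the peak observed so far equals 100
first_download = 100 * raw[:3] / raw[:3].max()   # 50, 100, 66.7
later_download = 100 * raw / raw.max()           # 33.3, 66.7, 44.4, 100

# levels differ between downloads, but percent changes over the overlap agree
print(first_download.pct_change())
print(later_download[:3].pct_change())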

@Sangno: I started doing it in Python and pandas: https://github.com/twiecki/replicate_google_trends I'm pretty sure it's not correct yet but if you want to help out that'd be appreciated. I'm also happy to add anyone to the repo if you give me your github account name.

@Thomas: Good news regarding the quota limit. Code looks useful. The scaling certainly is an issue.

Btw. you have to open the .ipynb file in the IPython Notebook http://ipython.org/notebook.html
You can also view it online here: http://nbviewer.ipython.org/urls/raw.github.com/twiecki/replicate_google_trends/master/goog_repl.ipynb

Again, this is very much work-in-progress and I would be very surprised if there weren't obvious bugs. I'll add more comments soon but let me know any questions or suggestions.

@Thomas Understood. I created a fork and a separate analysis at https://github.com/tlmaloney/replicate_google_trends. That's probably all the work I will do, because I don't think this strategy has legs, for reasons mentioned in this conversation. As Grant mentions above, the search term 'debt' itself is a free parameter.

@Thomas: I put my Excel file at https://github.com/leesanglo/Replicating_google_trends/ Because Google Trends data varies according to the search period and geographic area, how about we first replicate the paper exactly and check the results against the same dataset? I am wondering whether we can get 326% by following their trading strategy and the Google Trends data for the debt term. If I can, I would like to help replicate it. My account on github is leesanglo.

@Thomas: Great, thanks. I'll incorporate those changes. I'm actually now more interested in whether the results are reproducible to begin with.

@Sangno: I added you to the repo, thanks for offering to help out. Your analysis at the beginning already seems very promising so maybe we can replicate it in the IPython NB to make it easier to share and present. Note that we should probably include Thomas' changes first.

The sellthenews.tumblr.com blog post says this paper's method of calculating cumulative returns on shorting introduces a (1−p(t+1)/p(t))^2 bias. How is this bias derived? And, more importantly, how much of this "bank error" could contribute to the difference in results that we've seen from the tests above?

The bias is equal to log(p(t)/p(t+1)) - log(2 - p(t+1)/p(t)), which is the 'corrupted' form of log returns minus the real form. Applying the Taylor expansion to both logs gives the order of the error. It turns out this bias is somewhat small, even for weekly returns: on the order of 1 or 2% a year. That said, you should not be able to replicate the Preis paper in Quantopian (unless Quantopian incorrectly deals with shorting, which I doubt). You should, in theory, be able to replicate Figure 2 just using pandas and python, and replicating the bad accounting around shorts.
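A quick numerical check of the size of that bias, writing r for the index's simple return over the week (so p(t+1)/p(t) = 1 + r); this is just my reading of the formula above:

import numpy as np

for r in (-0.05, -0.02, 0.02, 0.05):
    real = np.log(1.0 - r)        # log(2 - p(t+1)/p(t)): wealth-based log return of the short
    corrupted = -np.log1p(r)      # log(p(t)/p(t+1)): the form attributed to the paper
    print(r, corrupted - real, r ** 2)  # the bias is roughly r**2, in the strategy's favour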

The authors of the paper note that they sampled the Google Trends data several times, so it might not be deterministic! If that is the case, I predict you will have great difficulties replicating their work.

@Brent: I think he wants to point out the difference between arithmetic returns and logarithmic returns. In general, the difference between the two measures grows as the rate of return becomes more extreme in either direction. For example, suppose you have a $100 investment. If the stock price increases to $200, the arithmetic return is 100% by (200-100)/100, but the logarithmic return is log(2) = 69.3%, around 70%.

He pointed out that the gap is larger for extreme increases or decreases. For example, if the above stock price falls to 0, the arithmetic gain on a short is (100-0)/100 = 100%, but the logarithmic return log(0/100) is infinite in magnitude. So, under log accounting, short investors theoretically get unlimited gains, but not unlimited losses.

I think the explanation of a positive bias is about the variance between the two measures, not the absolute difference. Please refer to http://www.cdiadvisors.com/papers/CDIArithmeticVsGeometric.pdf, where you will see a basic formula.

@Shabby: I think that when we incorporate Google Trends data into trading rules, the most crucial problem is that on Mondays Google Trends has still not finalized the trend for the previous week. On Mondays, Google Trends shows "partial data" for the previous week, but the trend data from one or two weeks earlier is fixed. Because trend data is normalized by the most frequent term for a given period, if we specify the same period and the same geographic area, we will get the same trend data for everything older than about two weeks. So, I think we should still get around a 300% return from a backtest, even if we acknowledge the incomplete data for the last two or three weeks of Google Trends.

I have emailed the authors for their Google Trends data. I will let you know if I hear back.

@Thomas: What a coincidence! I also emailed the corresponding author of the paper, Dr. Preis, about the Google Trends data yesterday. If I get some news from the authors, I will let you know too. If we do not get any news, I'm going to contact a journal editor.

Haha -- I also wrote the author a couple of days ago but didn't hear back yet.

It seems no one has gotten news from the corresponding author of the paper. So, I have emailed the editors of Scientific Reports about the paper. If I get a response, I will let you know.

@Sangno: Great, let us know if you get any reply. Also, have you had a chance to look at the IPy NB? I'd be more comfortable if we could all agree that it's theoretically doing the right thing. Certainly input from the author would help tremendously...

Meanwhile there is another blog post referring to this thread over at sellthenews which I think hits the nail on the head in regards to replicability.

Hello, guys.

At last, I got a response from an editor at the journal. He will contact the corresponding author and require them to make materials, data, and associated protocols available to readers on request. Before we request the necessary data, I think we need to replicate the paper in exactly the same way and summarize our findings.

@Thomas: Could you make another program, or modify the current one, to use data for the debt term over the same period? They used Google Trends data for the debt term from January 5, 2004 to February 22, 2011, restricted to the U.S. only. Also, would you replicate it with the 'culture' term?

One way to verify the result of our program is whether we get 33% profits when we apply a 'Dow Jones strategy'. If we use DJIA instead of Google trend data for some terms, that is a Dow Jones strategy (p. 5). Please draw cumulative returns and compare the outcome with the authors' outcome. You can see the authors' outcome in the last page of the supplementary information (http://www.nature.com/srep/2013/130425/srep01684/extref/srep01684-s1.pdf).

Using the DJIA data, I got a 32.36% profit for the debt term. But it seems that they employed the (p(t) - p(t-1)) / p(t-1) formula rather than ln(p(t)) - ln(p(t-1)). For the buy-and-hold return, the former formula gives me 16.47%, while the latter gives 15.25%. If we have time, it would be better to write up a report like the http://arxiv.org/abs/1112.1051 paper. If I get more news, I will let you know.

That's great, Sangno. Thanks for sharing. Let's hope he actually sends us the data.

I did some more work on the replication, fixing some bugs, adjusting the time range, etc:
http://nbviewer.ipython.org/urls/raw.github.com/twiecki/replicate_google_trends/master/goog_repl.ipynb

One thing I'm not sure about is whether I should actually do the log calculation at the bottom the other way around. That, however, leads to a losing strategy.

The paper looks very appropriate, I'll give it a read.

Hi Sangno,

Tom and I got the same email as well. Its very forthcoming of the author I think. I looked at the R code and it looks good.

Tom started to compare the data sources here:
http://nbviewer.ipython.org/urls/raw.github.com/twiecki/replicate_google_trends/master/compare_goog_data.ipynb

It doesn't seem that a scaling would influence a moving average but now we can compare the signals directly which should help.

I've looked briefly at the R code, need to think more about their accounting methodology. It's not exactly how they describe it in their paper. I think it's fishy. I'm going through a hypothetical price process. Let's say at week 1 the price is $100, week 2 it's $50. I start with $100 so I can buy or short 1 share in the beginning. Let's say I short at week 1. I end up with $150 after I buy back the shorted share. That's a 50% return, but by their methodology I would have gotten a 100% return. Maybe I'm wrong, but I think their short accounting is incorrect.

My general issue with all systems of this sort is that they have no plausible economic basis. There's no "story" of a market structure defect or behavioral bias that could account for the excess return, especially one this dramatic.

In the absence of that, all you are left with is either gross errors (look-ahead) in the analysis, or simple curve-fitting by another name...

@Simon, while I mostly agree with you there is a flip side. By looking at non-financial keywords we might be able to get insight into subtle shifts in consumer sentiment before it turns into changes in buying habits. Properly harnessed that could allow us to predict future changes in economic activity. Think of it as consumer market research. So perhaps it could be useful in predicting sector dominance.

Maybe, but I am very skeptical. Data mining 100+ terms, multiplied by whatever parameters they optimized for the moving averages, is far too many degrees of freedom compared with ~450 weeks of possible trade entries.

This seems to be the very definition of data mining. I suspect the word "tech" would have been equally impressive in the 1994-2000 timeframe had data been available.

EDIT: and as others have mentioned, if the results disappear when correcting a look-ahead bias, you can be confident the causality was never there, and people were just searching for reasons why the market dropped after they read that it did. What is more plausible:

1) prescient smart money managers search google for "debt", before placing big orders that presage market moves
2) retail masses search google for "debt" (as in "debt ceiling", "debt limit", "sovereign debt" etc), after reading about it in USA Today because that is what the business writers blamed the 10% correction on

I think that, for now, their findings look anecdotal as far as the headline figures go. The implementation in the paper has a double-precision issue and an issue with how the return of short selling is calculated. From the start, a 326% return using the 'debt' term in Google Trends seemed too high to me, even though we acknowledge that Google Trends has some degree of relationship with the Dow Jones.

In terms of their theoretical evidence, other studies support their findings. Mao (http://arxiv.org/pdf/1112.1051.pdf) compared the ability of survey, news, Twitter, and search engine data (i.e., Google Trends) to predict the financial market from 2008 to 2011. Interestingly, the study found that the search data have relatively strong correlations with volatility (VIX), the DJIA, and trading volume. "The Google Insights for Search (GIS) time series have a positive correlation with the VIX and trading volumes, but negative correlations with DJIA, which indicate that as more people search on financial terms, their market will be more volatile (i.e. high VIX), and trading volumes will be higher, while DJIA prices will more lower."

Also, the study ran a Granger causality analysis between GIS and the financial indicators. It turns out that Dow Jones volume doesn't lead GIS, but GIS leads Dow Jones volume by up to 2 weeks. So, that study used a 2-week lag (Preis et al.'s paper used 3-week lags).

Choi and Varian (http://people.ischool.berkeley.edu/~hal/Papers/2011/ptp.pdf) studied forecasting ability using Google Trends. Using an AR(1) model, they found that the MAE during the recession (2007/12 to 2009/01) is 8.86% without Google Trends and 6.96% with Google Trends, an improvement of 21.4%.

Google Finance has already adopted the idea and provided the service using Google Trends. http://www.google.com/finance/domestic_trends

OK, I made more progress on the IPy NB and I can replicate the findings now: http://nbviewer.ipython.org/urls/raw.github.com/twiecki/replicate_google_trends/master/goog_repl.ipynb

Turns out even a simple strategy is very difficult to get right, as the devil really is in the details. Quantopian makes this much easier in general (but see below for why it didn't work here).

My conclusions are as follows:
- There are some minor problems (shorting among other things), as Sangno pointed out, but nothing too major IMHO.
- Google Trends qualitatively changed their data from what Preis based their analysis on. Using more recent data, the results are way less impressive. It'd be interesting to know why the data are actually different; it can't be a scaling issue. It does seem that Trends now reports an integer from 0-100, so that would allow differences to creep in.
- Currently Quantopian has an execution delay of one bar (in our case daily), so orders made on Monday will only get filled Tuesday, thus changing the results.
- The author was very helpful by sending the code and data that allowed us to really track down the bugs in our code. Ideally that code would have been available somewhere online from the get go.

@Sangno: I really liked the Mao paper. Much more thorough analysis. The domestic trends also look cool and worth exploring!

Interesting, thanks for the notebook. I'm curious about this line:

data['log_returns'] = data.order_preis * np.log(data.djia_preis.shift(-1)) - data.order_preis * np.log(data.djia_preis)  

I'm not a python/pandas expert, but my reading of this is that the current bar's order determines what the return was last week? (aka - how is this shift(-1) not introducing future-snooping bias -- I must be misreading this code)

@Simon: Yes, that actually was a bug I had before (this is the fix). The idea is that we are buying (selling) at the price this week and selling (buying) at the price next week. This shift essentially achieves this (-1 will give next week's price). You are correct though that this calculation of returns wouldn't work in a walk-forward setting. But with a fixed strategy (i.e. selling (buying) after a week) that is adhered to no matter what this should be correct (I hope).

Alternatively I could do the returns calculation looking backward but the result should be identical. That would actually be more intuitive...
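For instance, something like this (same frame and column names as in the notebook, so treat it as a sketch) should give the identical series, just attributed to the week the position is closed rather than opened:

data['log_returns_bwd'] = data.order_preis.shift(1) * (
    np.log(data.djia_preis) - np.log(data.djia_preis.shift(1)))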

Does that make sense?

I see. I had had terrible times avoiding future snooping when doing python like this, which is why it raised a red flag. Have you been able to replicate in a zipline?

EDIT: never mind :)

I agree. It's kinda funny -- I started out doing algorithmic trading with zipline/quantopian. Doing this by hand really highlights the difficulties and pitfalls of doing this manually (although pandas helps quite a bit). And this is a super simple strategy!

As to zipline (although you know the answer already ;) ): Not yet, but I think the order execution should be changed to allow for basic testing. I think slippage, commission, execution delay etc are critical components. But in my work-flow I try to get it to work without any complexities to make sure it's doing the right thing first. Then later I add those to see if and where it breaks down. Do others do this similarly?

Yes, when I was trying to reconcile results between a pandas ipython session and a zipline, I turned off all the volume slippage and everything. I still needed to shift series forward in my pandas to align the "trade" dates with when the zipline would end up executing things. I finally got everything to match, but it was a long night.

So many of these academic systems use Market-on-Close and Market-on-Open orders, those will be a real help (if they haven't been implemented already, I haven't been following!). The trick is just to make sure that the event during which one can plan MOO orders has access to just the opening price.

I've run a simple analysis of all the search terms used in the original paper, and find that any 'effect' is simply due to data-mining bias (or 'selection bias'). I should note that while the 'backtest' I use is fairly crude, it is probably only biased in being mildly optimistic (free trades, free shorts, optimal position sizing, trading in an index, etc.). If it were the case that something looked significant, I would want to rerun it in a high-fidelity backtesting framework, but it is not needed in this case.

Excellent analysis! Printing this out for the subway ride home.

Interesting concept, but it does not survive proper statistical analysis and elimination of bias. First, the search term has to make sense. Using "debt" may have provided a good signal for that period of time, but it is skewed by the times and should not be relied upon going forward. I would imagine you'd get similar results using the search terms "Justin Bieber" or "Kim Kardashian".

Search terms like "how to buy stocks", "best brokerage account", "hot penny stocks", or "mortgage my house to buy stocks" would be better correlated to the herd.

Ken Simpson
CrystalBull.com

@shabby shef: Great analysis indeed. I do think the Wikipedia paper is more convincing in this regard, as it's more of a distributional analysis (similar to what you did). I wonder if your analysis would look different for, e.g., the financial terms.

I suspect that the 'Sharpe' reported in the 'Risk Metrics' tab in these conversations is miscomputed by the zipline system. For example, the Sharpe reported in @Dennis C's first backtest above is 8.35, which seems far too high. Poking around the zipline risk module, I do not see the mean of returns computed anywhere, and so it seems that Sharpe is computed by dividing total returns by volatility, instead of mean returns. This is broken. Can anyone confirm this?
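For reference, the textbook calculation I would expect from per-period returns (ignoring the risk-free rate; this is not zipline's code):

import numpy as np

def annualized_sharpe(period_returns, periods_per_year=252):
    # annualized mean of per-period returns over their annualized standard deviation
    r = np.asarray(period_returns, dtype=float)
    return np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)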

When this paper came out I tested it in Excel, which is easy since the data is weekly. I made a couple of modifications in the testing. Most importantly, I found that 'CNBC' was a better search term than 'debt'. My rule was to short the SPY if the current week's search value was higher than the median of the last 11 weeks and to buy the SPY if the value was lower. The purpose of the median is to reduce the impact of extreme values. Even factoring in transaction costs my returns were something like 600+% compared with 40% for the ETF (cumulative returns).

I am uncomfortable with this approach because of quirks I've noticed in the Google Trends data. If you download the data for the same search term a few weeks later it will change. I expected some change due to scaling, which wouldn't be an issue. However it's messier than that. In one download the value may rise from week x to x+1, and in a download a few weeks later it will fall. That doesn't make a lot of sense to me. I'd appreciate any insight you have into the workings of Google Trends.

I apologize for not posting the CSV. This is my first day on Quantopian and I'm just figuring my way around. Also, it appears I'm going to have to learn Python.
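For what it's worth, the rule described above could be sketched in pandas roughly like this (file and column names are placeholders, and I'm assuming the median is taken over the previous 11 weeks):

import pandas as pd

trends = pd.read_csv("cnbc_trends.csv", parse_dates=["date"], index_col="date")["value"]

median_11w = trends.shift(1).rolling(11).median()  # median of the previous 11 weeks

signal = pd.Series(0, index=trends.index)
signal[trends > median_11w] = -1   # short SPY when this week's value is above the median
signal[trends < median_11w] = 1    # long SPY when it is below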

I have noticed such changes in the historical data too. Whatever normalization process they run also adjusts the historical values in some way other than simply scaling the values to 0-100. The documentation page is pretty vague. They only mention that the data is divided by some common variable.

Instead of looking for search terms to apply retrospectively, what if one cross checks the data for "trending" terms on Google with some filtering (eg exclude "kim khardashian"). I know there are companies which do this with twitter and reuters/bloomberg already but don't know about their success rate.

From my experience as a short term (<1 month) trader, information dissemination is an inverse exponential process. Very few know about something, then some people start talking about it, then more and more until it hits the front page of every national newspaper. This dynamic is directly reflected in market volatility leading up to an event or events. I'm speculating it should be possible to determine a range for that inflexion point. Recent examples are Greece, Cyprus and Fukushima.

I am new to the site - just joined yesterday - and am playing around with the Google Trends script to use "recession" as the keyword instead. Not quite sure if I've got the dates right. Thanks!

Excellent posts, thanks for sharing, only just joined Quantopian.

A new analysis of Google Trends based on the Preis paper was uploaded to Arxiv:
http://arxiv.org/abs/1307.4643

Predicting financial markets with Google Trends and not so random keywords

Challet Damien, Bel Hadj Ayed Ahmed

We check the claims that data from Google Trends contain enough data to predict future financial index returns. We first discuss the many subtle (and less subtle) biases that may affect the backtest of a trading strategy, particularly when based on such data. Expectedly, the choice of keywords is crucial: by using an industry-grade backtesting system, we verify that random finance-related keywords do not to contain more exploitable predictive information than random keywords related to illnesses, classic cars and arcade games. We however show that other keywords applied on suitable assets yield robustly profitable strategies, thereby confirming the intuition of Preis et al. (2013)

Hello Thomas!
I have a question: did you test your algorithm with Preis' Google Trends data (I mean the exact same data)? If yes, how did you manage to get it?

Thanks!

Hi Benoit,

Yes, Tobias Preis was kind enough to send the data on which the paper was based. All we had to do was ask. You can find our replication here:
http://nbviewer.ipython.org/github/twiecki/replicate_google_trends/blob/master/goog_repl.ipynb

And the full repo containing the data here: https://github.com/twiecki/replicate_google_trends

Thomas

Thank you very much Thomas! It'll be very helpful for our research.

Benoît

I also updated the repo with a readme now.

In your calculations, did you include some (virtual) broker fees, for instance?

Benoît

The Quantopian version by default simulates things like transaction cost, slippage, and order delay (and the strategy is quite sensitive to that it seems). The IPython replication does not.

Thomas Wiecki, I am a high schooler writing an english paper on algorithmic trading. I have figured that you seem to know a lot about the topic. If you wouldn't mind answering some interview questions, send me an email at [email protected] Thanks

Nathan, Sounds great. Just responded to your email.

@Thomas, in the repo there is some code I can't understand. Could you help me?

data['rolling_mean'] = pd.rolling_mean(data.debt, delta_t).shift(1)  
data['rolling_mean_preis'] = pd.rolling_mean(data.debt_preis, delta_t).shift(1)

data['order'] = 0  
data['order'][data.debt > data.rolling_mean.shift(1)] = -1 # is this a bug for shift .as you already shift  
data['order'][data.debt < data.rolling_mean.shift(1)] = 1 #  
data['order'].ix[:delta_t] = 0

data['order_preis'] = 1  
data['order_preis'][data.debt_preis > data.rolling_mean_preis] = -1 # no shift ?  
data['order_preis'][data.debt_preis < data.rolling_mean_preis] = 1 #  
data['order_preis'].ix[:delta_t] = 0  

My question is: rolling_mean is already shifted when it is created, but you shift it again in this line:
data['order'][data.debt > data.rolling_mean.shift(1)] = -1
and don't shift it in this line:
data['order_preis'][data.debt_preis > data.rolling_mean_preis] = -1

This might be a left-over. I don't think the 'order' signal is used anywhere. Instead 'order_preis' is the actual replication. It should thus be safe to delete the middle code section. But yeah, you are right, it makes no sense to shift it twice!

Feel free to submit a pull request that fixes the issue.

Thomas, thanks for posting this and keeping it updated, especially that follow up paper!

Thomas (or anyone who would know), do you know if Preis' NAV series is available? I'd like to look at it in some way other than via his debt curve.
Benoît

Seen this?:
Do Google Trend Data Contain More Predictability than Price Returns?
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2405804

The csv file was removed so here is an updated version which also uses the original Preis data for the debt search word.

I also refactored the code a bit to clean it up.

There's a follow-up paper by the same group: http://www.pnas.org/content/early/2014/07/23/1324054111.short

Quantifying the semantics of search behavior before stock market moves

Technology is becoming deeply interwoven into the fabric of society. The Internet has become a central source of information for many people when making day-to-day decisions. Here, we present a method to mine the vast data Internet users create when searching for information online, to identify topics of interest before stock market moves. In an analysis of historic data from 2004 until 2012, we draw on records from the search engine Google and online encyclopedia Wikipedia as well as judgments from the service Amazon Mechanical Turk. We find evidence of links between Internet searches relating to politics or business and subsequent stock market moves. In particular, we find that an increase in search volume for these topics tends to precede stock market falls. We suggest that extensions of these analyses could offer insight into large-scale information flow before a range of real-world events.

Here is the backtest result for the first algorithm using Google Trends CSV data for "bankruptcy" instead of "debt". Not quite as good but still exceptional bear market performance and an interesting comparison. One of the few decent keywords I have found so far following the same/slightly modified strategy. (tough finding positive correlations with individual equities/ETFs)

Hunter, thanks for sharing!

It would be interesting to test the political keywords they argue to be predictive in their last paper.

I'm still learning Python, so I'm trying to figure this out: is there a way to use multiple keywords at the same time? You could use both positives and negatives as triggers (like 'bankruptcy' and 'growth').
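One way to sketch that in pandas (hypothetical file names; 'bankruptcy' treated as a bearish keyword and 'growth' as a bullish one, trading only when the two agree):

import pandas as pd

# the two series are assumed to share the same weekly dates
bankruptcy = pd.read_csv("bankruptcy.csv", parse_dates=["date"], index_col="date")["value"]
growth = pd.read_csv("growth.csv", parse_dates=["date"], index_col="date")["value"]

delta_t = 3  # trailing window in weeks

def rising(series, window=delta_t):
    # True where last week's value is above its own trailing average
    return series.shift(1) > series.shift(1).rolling(window).mean()

signal = pd.Series(0, index=bankruptcy.index)
signal[rising(growth) & ~rising(bankruptcy)] = 1    # bullish: growth searches up, bankruptcy searches down
signal[rising(bankruptcy) & ~rising(growth)] = -1   # bearish: the reverse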

Maybe this is just due to a leverage effect: with a smaller order size the performance degrades, and some runs even go into negative territory if the performance is not that strong and you reduce the order size.

IMO a major problem is the data inconsistency of Google Trends. I have found examples (I will document them if you want) where the trend between two dates is reversed (i.e. over one download period the interest for a keyword increases between two dates, while over another it decreases).

This strategy would work way better with absolute values, but Google seems not interested in making them public.

Google search terms is literally the holy grail. Doubtful they ever release the full data set but we can dream.

There's also this recent ECB paper that discusses the predictability of google search terms.

http://www.ecb.europa.eu/pub/pdf/scpsps//ecbsp9.en.pdf?177000b829d4450b007f3d3a612cab18

http://blogs.wsj.com/economics/2015/07/14/social-media-sentiment-presages-market-moves-ecb-paper/

Hi James,

I'm going to read the paper you linked, thank you. My opinion was based on Preis' work. Even if his strategy works well in backtesting, Google always displays its results with a delay of one or two days. For example, let's look at the keyword "debt" on a daily basis: https://www.google.com/trends/explore#q=debt&date=today%201-m&cmpt=q&tz=Etc%2FGMT-2

We can see here that the latest value is Wednesday's. While this has no importance when backtesting (as we're looking at the past, the data is available), my concern is about applying this to real, day-to-day live trading.

Maybe you've heard of a method to solve this problem? In the meantime, I'm going to read the report.

I'm not sure if Preis, with this method, would average out the data and deem it effective: lower search volume means buy and higher volume means sell. I have backtested it after 2011 using an average, unlike the prior version. I'm not sure if Preis would totally agree with this method and deem this strategy effective, since in his example he's not using an average. Which is more effective, what do you think?

I am currently in the process of preparing a paper for publication which utilizes Google Trends to measure the impact generated by international terrorist attacks. I was curious to see what effect my research topic, i.e. the frequency of Google searches related to terrorism, might have on gold stocks. Definitely an interesting result, especially in later years.

@David Gordon
Your leverage needs to be checked, since anything over 3x is unattainable.

Have you tried using random terms, such as puppies, flowers, blue, cards, just to test whether it's not simply related to overall increases in search volume?

@Wiecki I think this is leverage. Can it still outperform without the leverage part? Thanks.. ;)

@chan I'm new to this, so apologies if my questions/comments seem foolish. I ran a minute-mode test of your algorithm, and the results show around a negative-millions-percent return, quite unlike the daily backtest. What creates such a difference between the two tests, and if I use the algorithm in real-time trading, will it trade according to the minute test or the daily one? Thanks.

@Luke it's not my algo, I just cloned it from above. I haven't tried testing it in minute mode though, but it's interesting that you get that kind of result with the same code in the algorithm.

If you download the Google Trends data now and compare it to the one in Wiecki's upload in the backtest, you will notice some numbers have been re-normalized or have small differences here and there.

I am kind of wondering how these values are adjusted.

I get no open position at all on my backtests...
Did anything major change?
Did I forget to add something?

Hi Pedro,

Can you share your backtest so I can take a look?


I am trying to make a simple modification to the code, replacing debt with flu and the S&P with Merck, Sanofi or Pfizer. I think there's a bug, as it keeps buying and selling on the same day. Any help would be appreciated, thanks!

Edit: thanks Jamie, I got it to work. For those interested, there's only a weak signal from buying Merck, Sanofi and Pfizer on flu search increases.

Hi Tim,

It looks like there's multiple order logic in your code between lines 43 and 50. I would expect it to be selling and buying every day! (specifically, line 43 followed by line 48). I would suggest using the order_target() or order_target_percent() functions instead!
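A bare-bones illustration of that suggestion (not the actual algorithm; the symbol and signal here are placeholders, and symbol() / order_target_percent() are Quantopian built-ins):

def initialize(context):
    context.stock = symbol('PFE')  # placeholder security

def handle_data(context, data):
    flu_searches_rising = False  # placeholder for the actual flu-search signal

    if flu_searches_rising:
        order_target_percent(context.stock, 1.0)   # move the position to 100% of the portfolio
    else:
        order_target_percent(context.stock, 0.0)   # flatten the position instead of re-selling every day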

Hi All,

I was just wondering how to modify the algorithms to take into consideration our current cash amount/total current position/maximum allowed leverage and stuff like that.

-Amro

Hi Amro,

You can use attributes of the portfolio object such as context.portfolio.cash or context.portfolio.positions.

You can get your leverage from the account object using context.account.leverage but in order to keep it under a maximum, you will need to include logic in your trades that manually guards against over ordering. For example, you can use order_target_percent() to place orders up to a certain fraction of your portfolio, and use get_open_orders() to ensure that you don't place duplicate orders when fills don't happen instantaneously.
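Roughly, putting those pieces together might look like this (the leverage cap and target weight are arbitrary placeholders; symbol(), order_target_percent(), get_open_orders() and log are Quantopian built-ins):

def initialize(context):
    context.stock = symbol('SPY')
    context.max_leverage = 1.0  # assumed cap

def handle_data(context, data):
    # avoid duplicate orders while earlier ones are still open
    if get_open_orders(context.stock):
        return

    # only add exposure while under the leverage cap
    if context.account.leverage < context.max_leverage:
        order_target_percent(context.stock, 0.75)  # target 75% of portfolio value

    log.info("cash: %s, positions: %s" % (context.portfolio.cash,
                                          len(context.portfolio.positions)))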

I signed up to Quantopian especially to engage in this brilliant discussion.
I have been a fan of the site for months but did not sign up because I'm not good at programming. What I do think I'm good at however is investing.

Now, I like the premise of the studies mentioned, but to be honest I don't care about their outcomes. In other words, it would be more interesting to use this methodology under a different investment philosophy.

That philosophy, or at least mine is based on two notions: 1) Buy and hold is the best strategy for stocks. 2) You should be greedy when others are fearful, and brave when others are greedy. By brave I mean willing to hold on to your stocks even though you know a correction is coming.

Such a philosophy shouldn't be concerned about timing the buy, but timing is what makes you beat or lose to the S&P, not your stock picking ability. If you can time your buys of the S&P 500 right, you will beat all those fund manager hot shots who focus on picking stocks. This is where the Google Trends method comes in handy.

Now the idea is that the public will be keen to know about the stock market when its at an extreme. I just looked up "stock market" trends for the past 12 years and the results are interesting. Unfortunately I don't know how to share that chart so you can either try for yourself or someone can post that chart.

The interest-over-time chart shows sudden spikes of interest in particular months. Because programmers like to quantify things, we will call it a spike when the interest-over-time value exceeds 40. This happened 7 times in the last 12 years:

1) February 2004. 2) January 2008. 3) September-November 2008. 4) March 2009. 5) August 2011. 6) August 2015. 7) January 2016.

With the exception of February 2004, every other incident coincided with either a huge decline in the S&P 500 or a major bottom or both.

There is a simple explanation for this, of course: people care about newsworthy events. So it is only natural that the trend goes up when there is a major event, and that event is more likely to be a crash than a bubble, for one reason:

In the past (at least), news outlets (BBC, CNN, NY Times) treated stock market crashes as top national news, but new stock market highs as business news only, not national news.

The conclusion of this, in my opinion, is to buy stocks and hold them when people are keen to know about the stock market (interest above 40) and remain in cash when they are not. It would be cool if someone ran a backtest to verify this.
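
No one has posted a backtest of this yet, but the rule itself is only a few lines, assuming the weekly "stock market" interest series has been prepared in the same CSV format as the 'debt' file earlier in this thread and loaded with fetch_csv(). The URL and the 'interest' column name below are placeholders; the 40 threshold comes from the post above, not from a tested algorithm.

def initialize(context):
    context.spy = symbol('SPY')
    # Placeholder URL: a CSV of weekly Google Trends values for "stock market",
    # formatted like the 'debt' file used earlier in this thread.
    fetch_csv('https://example.com/stock_market_trends.csv',
              date_column='Week',
              symbol='trends')

def handle_data(context, data):
    interest = data.current('trends', 'interest')
    if interest > 40:
        # Public attention is spiking: be fully invested.
        order_target_percent(context.spy, 1.0)
    else:
        # Otherwise sit in cash.
        order_target_percent(context.spy, 0.0)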

The next step for me is to see whether we can do this on an annual basis, and whether there is a way to select a particular stock or a number of stocks. My premise would be that the stocks people are most worried about are irrationally priced to the downside and should be an even better opportunity than buying the S&P 500 as a whole. I'm not sure how to apply that to Google Trends, however.

This thread seems to have died because of the unreliability of Google Trends data, which is changed and amended after the event. This makes meaningful backtesting impossible.

The concept itself seems to have great promise. But unfortunately the owners of the data (Google) are the only ones able to profit from it, since only they can correct the retrospective changes Thomas and others have discovered.

Or has there been background research not reported on this thread which overcomes this fundamental flaw?

Well, I think it's possible. But as I said, I know very little about programming, so let me explain my reasoning.
The way I see it, the Google Trends graph looks like a stochastic oscillator of search volumes. When the search volume reaches the highest point during the covered time period, that value becomes 100. The remaining points on the graph are rescaled relative to that 100 value. When a new high in searches is hit, a new 100 figure is established, and everything is recalculated from that.

Obviously, backtesting a moving-average strategy would be difficult. However, a breakout strategy should be doable: whenever search interest for "stock market" reaches a new high, buy and hold the S&P 500. And if there is a way to (after the S&P 500 criterion is satisfied) go through each component of the index at the time and see which companies were at their highest share price during the same period, you could then buy those. I think such a breakout strategy can be automated.
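
Offline, the new-high trigger itself is easy to express, assuming the "stock market" series has been exported to a CSV with 'Week' and 'interest' columns (both column names are assumptions about the export format):

import pandas as pd

# Load an exported Google Trends series; the column names are assumed.
trends = pd.read_csv('stock_market_trends.csv', parse_dates=['Week'], index_col='Week')

# A week is a breakout if its value exceeds every earlier value in the export.
prior_max = trends['interest'].cummax().shift(1)
breakouts = trends.index[trends['interest'] > prior_max]

print(breakouts)  # dates on which the rule says "buy and hold the S&P 500"

Picking the index components that were at all-time highs on those dates would need a separate price history for each constituent, which is where a platform like Quantopian is more convenient than a manual check.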

I ran this test manually on the most recent "spike" I referred to in the earlier post. If you had bought the S&P 500 at the beginning of the year, you'd be up 0.41% YTD. If instead you had bought the S&P 500 when Google Trends "spiked" (the open of February 2016), you'd be up 6% as of last Friday's close.

However, if you had bought the select S&P companies that were at an all-time high when Google Trends "spiked" earlier this year (a total of 12 companies), you would be up around 40.7%.

Such a method deserves our best effort at a backtest if we can manage one. Of course, there are downsides to buying select companies this way rather than the S&P index, but I will discuss them in due time.

What is the best way to alter these algorithms now that "@batch_transform" has been discontinued?

Is anyone returning to this thread now that Google have real-time data?

I'm still trying to figure out how to download the data every 15 minutes for a series of 100 search terms. Can anyone help out?

@Jamie Burton

Google Trends has always had real-time data but they expanded the availability recently. I don't think Quantopian includes support for getting Google Trends data in real-time. At least it didn't when I looked into this a while ago.
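
For anyone trying to pull the data outside Quantopian, the unofficial pytrends package is one option (pytrends is not affiliated with Google or Quantopian and its API may change; note that Google accepts at most five terms per request, so 100 terms would need to be split into batches):

import time
from pytrends.request import TrendReq

terms = ['debt', 'stock market', 'flu']   # example batch; at most 5 terms per request

pytrends = TrendReq(hl='en-US', tz=0)

while True:
    # 'now 1-d' asks for the last 24 hours at sub-daily resolution.
    pytrends.build_payload(terms, timeframe='now 1-d')
    df = pytrends.interest_over_time()
    df.to_csv('trends_latest.csv')
    time.sleep(15 * 60)   # refresh every 15 minutes

In practice Google rate-limits aggressive polling, so cycling through 100 terms every 15 minutes may require pauses between batches.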

I am interested in shorting some stocks, mainly ones focused on China (thesis explained: https://forum.basic-capital.com/t/is-china-the-next-big-short-2018/105).

I wonder whether search would give you a sizeable advantage in this case?

Financial news plays an enormous role in dictating stock prices. Large search volume comes from investors looking for more information on current events, and current trends are made so by the news. By the time people begin searching things in high volume, I would imagine the market already knows about the events?

Another thought: I would imagine Google Trends could give you an advantage for a while, until another hedge fund came along. In other words, I think it is a strategy that could be very successful in the short term but might struggle in the long term...

I am wondering how to get the closing prices of the DJIA on the first trading day of every week. I know how to get the daily prices for the entire period, but I don't know how to extract only the first trading day of each week from this. Does anyone know how I could do this in R?

John,

In Q2 it is as simple as:

def initialize(context):
    # Run once per week, shortly before the close of the first trading day of the week.
    schedule_function(get_week_start_closing_prices, date_rules.week_start(), time_rules.market_close())

def get_week_start_closing_prices(context, data):
    # DIA tracks the DJIA; grab its price at the scheduled time.
    price = data.current(symbol('DIA'), 'price')
    record(week_start_closing_prices=price)
    print(price)
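
For working outside Quantopian on an already-downloaded DataFrame of daily closes (the question originally asked about R; here is the pandas equivalent, with the 'date' and 'close' column names assumed):

import pandas as pd

# Daily DJIA prices indexed by trading date; column names are assumed.
daily = pd.read_csv('djia_daily.csv', parse_dates=['date'], index_col='date')

# Default weekly bins run Monday-Sunday, so .first() returns the close of the
# first trading day in each week (the index label is the week-ending Sunday).
weekly_first_close = daily['close'].resample('W').first()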

I found that Google Trends data is inconsistent depending on the timeframe used to retrieve it. Try getting data for overlapping periods and you will see that the proportions within the intersecting area are not constant, even though they should be.

Hi guys. I just joined Quantopian and noticed this brilliant discussion of Preis's paper "Quantifying Trading Behavior in Financial Markets Using Google Trends". However, this thread seems to have been quiet for quite a long time. Is it because of the unreliability of Google Trends data, which is changed and amended after the event? Recently I came across another paper, "Forecasting Stock Market Movements using Google Trend Searches" (https://link.springer.com/article/10.1007/s00181-019-01725-1), which seems to extend Preis's original work. Any thoughts on this paper?

I found that Google Trends data is extremely inconsistent. I tried to fetch a one-year period at daily precision by stitching together overlapping periods and rescaling them - otherwise there is apparently an information leak from the future (you know the series is scaled to its highest spike). So, to me, it's not worth the time.

@Ivan, I agree with you on GT data inconsistency. Is there any way to eliminate or minimize it? For example, let's say we export GT data (say, for "debt") for the period 01/01/2010 - 31/12/2010. If we do the export once a week for ten weeks, we get 10 series of "debt" GT data. Then we take the average, which may be closer to the "true" value. Any thoughts?
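
A quick sketch of that averaging idea, assuming the repeated exports are saved as debt_1.csv through debt_10.csv with 'Week' and 'debt' columns (file and column names are assumptions); each pull is divided by its own mean before averaging, since Google normalizes every export to its own peak:

import pandas as pd

# Load the repeated exports of the same 2010 period (assumed file/column names).
pulls = [pd.read_csv('debt_%d.csv' % i, parse_dates=['Week'], index_col='Week')['debt']
         for i in range(1, 11)]

# Rescale each pull by its own mean so the differing 0-100 scalings line up,
# then average across pulls to smooth out Google's sampling noise.
rescaled = [s / s.mean() for s in pulls]
averaged = pd.concat(rescaled, axis=1).mean(axis=1)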

@David, I don't know; the inconsistencies were huge for the same days when using overlapping periods, so I gave up. At some point I will try to estimate sentiment in Russia using Yandex statistics.

Can we combine something like this with a value-contrarian buy list? The value-contrarian buy list has an upward bias already, and if we trade those stocks well and then combine it with a perceptron or something to that effect, I think we can really get it cooking.