Market/Security prediction using Machine Learning classifier and Google Trend data

I have created this notebook, which tries to learn patterns between keyword search volume, as provided by Google Trends, and a selected security, to see whether an alpha signal can be generated.
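As a rough illustration of what "learning patterns" between search volume and a security can mean, here is a minimal sketch of how training rows could be assembled for a classifier. This is not the notebook's actual code: the function name `build_training_rows`, the choice of period-over-period changes as features, the up/down label, and all the sample numbers are assumptions made for the example.

```python
def build_training_rows(trend_rows, prices):
    """Pair each period's change in keyword search volume with the
    direction of the security's move over the following period.

    trend_rows: list of dicts (keyword -> search volume), one per period
    prices:     list of closing prices aligned with trend_rows
    """
    keywords = sorted(trend_rows[0])
    samples = []
    # A previous period is needed for the feature, a next one for the label.
    for t in range(1, len(trend_rows) - 1):
        features = [trend_rows[t][k] - trend_rows[t - 1][k] for k in keywords]
        label = 1 if prices[t + 1] > prices[t] else 0  # 1 = price went up
        samples.append((features, label))
    return keywords, samples


# Made-up values, for illustration only.
trends = [
    {"debt": 40, "stocks": 55},
    {"debt": 46, "stocks": 50},
    {"debt": 43, "stocks": 58},
    {"debt": 60, "stocks": 52},
]
closes = [100.0, 101.5, 99.0, 100.2]

keywords, samples = build_training_rows(trends, closes)
for features, label in samples:
    print(dict(zip(keywords, features)), "->", label)
```

The label always looks one period ahead of the features, so the classifier only ever sees search data that would have been available before the price move it is asked to predict.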

The data file used by the notebook can be downloaded from:

https://dl.dropboxusercontent.com/u/41007056/data.csv

Please provide your comments.

Thanks,

Luc

9 responses

Here is a backtest result using multiple securities.

@luc, I just came across your post. This is great work! Would you be willing to share your code on how you generated the data file?

This is really, really cool! I wonder if it could be fitted to a futures strategy.

@Jonathan, @Nicky => Thanks.

Here is a link to the Python notebook, the keyword list, and some historical Google Trends data. All the data file contains is the Google Trends values received from Google.

https://dl.dropboxusercontent.com/u/41007056/data-building-notebook.zip

The notebook can't run in Research, but it will run fine on a local machine. The data-building notebook above should also be run on a local machine, with the resulting dataframe saved as a CSV. That CSV is in turn used by the Quantopian algo to run the backtest.
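The hand-off described here (build the data locally, save it as a CSV, read it back on the algo side) can be sketched with the standard library alone. The real notebook works with a pandas dataframe, and the column names `date`, `keyword`, and `value` below are hypothetical, as is the in-memory buffer standing in for the file. On Quantopian the reading side would presumably go through something like `fetch_csv` rather than this reader.

```python
import csv
import io


def write_trends_csv(rows, fieldnames, handle):
    """Write the locally built Google Trends rows out as CSV."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


def read_trends_csv(handle):
    """Read the CSV back, converting the search volume to an integer."""
    return [
        {"date": r["date"], "keyword": r["keyword"], "value": int(r["value"])}
        for r in csv.DictReader(handle)
    ]


# Made-up rows; the column layout is an assumption, not the real file's.
rows = [
    {"date": "2016-01-03", "keyword": "debt", "value": 40},
    {"date": "2016-01-10", "keyword": "debt", "value": 46},
]

buffer = io.StringIO()  # stands in for data.csv on disk
write_trends_csv(rows, ["date", "keyword", "value"], buffer)
buffer.seek(0)
loaded = read_trends_csv(buffer)
print(loaded)
```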

The main issue with this algo at this point is that there are not enough triggered shorts to have a statistically significant backtest.
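To give a feel for why a small number of short trades is a problem, here is a back-of-the-envelope sketch of the margin of error on a hit rate, using the normal approximation to the binomial. The trade counts are invented for illustration and are not taken from the backtest, and the approximation itself is rough when the number of trades is small.

```python
import math


def hit_rate_margin(wins, trades, z=1.96):
    """Approximate 95% margin of error for a hit rate (normal approximation)."""
    p = wins / trades
    return z * math.sqrt(p * (1 - p) / trades)


# Hypothetical counts: the same 60% hit rate with few and with many trades.
few = hit_rate_margin(6, 10)
many = hit_rate_margin(240, 400)
print(round(few, 3), round(many, 3))
```

With only 10 trades the margin is around 30 percentage points, so a 60% hit rate cannot be told apart from a coin flip. With 400 trades it shrinks to about 5 points.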

Futures may be interesting to test, I'll have to look into it.

@luc, great idea to relate Google Trends to stock performance. As a newbie, I am trying to understand how the following line works:

f_high_predictiveness = df[-1*int(context.number_of_positions_ratio*len(df)):]

Ashish,

Disregard this line of code. I was trying to rank predicted securities by their past "predictiveness", and that simply does not work.
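For anyone still curious about the mechanics: the slice keeps the last `ratio * len(df)` rows, so if the frame is sorted in ascending order of "predictiveness" (an assumption here), those are the highest-ranked ones. A minimal sketch with a plain list, which slices by position the same way a dataframe does with an integer slice; the helper name `last_fraction` is made up:

```python
def last_fraction(rows, ratio):
    """Keep the last int(ratio * len(rows)) entries, mirroring
    df[-1*int(ratio*len(df)):] from the algo."""
    n = int(ratio * len(rows))
    return rows[-1 * n:]


ranked = ["A", "B", "C", "D", "E", "F", "G", "H"]  # lowest to highest rank

top_quarter = last_fraction(ranked, 0.25)  # int(2.0) == 2 -> last two entries
edge_case = last_fraction(ranked, 0.1)     # int(0.8) == 0 -> rows[-0:] is rows[0:]
print(top_quarter, edge_case)
```

Note the edge case: when the ratio is small enough that `int(...)` rounds down to 0, the slice returns everything rather than nothing.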

/Luc

Again (as Dropbox killed the public folder for 1TB users :( ):

Link to the latest Google Trends data:
https://www.dropbox.com/s/czhlnw8zycgxdfj/data.csv?dl=0

/Luc

Luc, can you link the data-building-notebook.zip for users who are not 1TB users? Thanks.

Instead of trading the SPY ETF directly, I applied the algorithm's result to a simple momentum algo (I just used Q's simple momentum algo). That is, when the ML model trained on Google Trends data indicates calm waters ahead, the momentum algo is biased long. When the ML model indicates a potential market reversal, the momentum algo goes a little short. Here is the backtest. So, maybe the initially proposed algo does not stand on its own, but rather could be used as a long/short ratio signal.
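The idea of using the ML output as a long/short ratio signal can be sketched roughly as follows. This is not the backtested algo: the function name `target_weights`, the equal weighting, the ticker names, and the exposure values (fully long when calm, 25% net short when a reversal is flagged) are all assumptions chosen to illustrate "biased long" versus "a little short".

```python
def target_weights(picks, reversal_expected,
                   long_exposure=1.0, short_exposure=-0.25):
    """Equal-weight the momentum picks, scaled by a net exposure that
    depends on the ML signal (hypothetical exposure values)."""
    if not picks:
        return {}
    exposure = short_exposure if reversal_expected else long_exposure
    return {symbol: exposure / len(picks) for symbol in picks}


picks = ["AAA", "BBB"]  # placeholder tickers chosen by the momentum algo

calm = target_weights(picks, reversal_expected=False)
risky = target_weights(picks, reversal_expected=True)
print(calm, risky)
```

Keeping the ML signal separate from the stock selection means the momentum logic stays unchanged, and only the overall net exposure responds to the Google Trends prediction.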

/Luc