Why the surge to implement social media data companies over economic data?

In my opinion, the list of available data is vastly incomplete. What's the idea behind providing loads of social media data from various vendors?

One could definitely use real estate data, commodity data, interest rate data, etc. for regime change measures.

I am not sure whether high-quality data is free, but in my opinion social media data will become quite irrelevant as the future unfolds. The firm producing the most alpha should win out. Also, most of these firms offer a premium minute-level feed, which means most of the numbers being traded on have already been exposed to others before we get them.

This is a weird business where you do nothing to tell people you are accumulating a position, but once you have it, you promote the hell out of it so others pump up the price.

5 responses

Hi Miles,

It's a reasonable question. Beyond estimates from Estimize, macro data sets from Quandl and events from EventVestor, we do have a few sentiment-focused vendors. In setting up this online marketplace, it takes two to tango. The earliest firms to participate tend to be those that are willing to try something new and different from their traditional market. Vendors like Accern, Sentdex and PsychSignal were among those firms.

We're always working on adding new data sets and filling out our data offering, so if you have specific vendors in mind, I'm more than happy to investigate adding them. And if there are specific data sets from Quandl that cover some of the macro data in which you're interested, definitely let me know what those would be (with our existing integration to Quandl, adding incremental Quandl sets tends to be easier).

Quandl was the main provider I had in mind. Instead of incrementally adding datasets, are there talks about creating a setup where you can pull in any dataset available on Quandl, and where those who wish to pay for premium datasets can choose to do so? This would eliminate having to speak directly with those companies.

I suppose one could put Quandl data into a CSV and then pull it in that way.
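As a rough illustration of that CSV route, here is a minimal sketch that parses an exported time series using only the Python standard library. The column names (`Date`, `Settle`), the helper name `load_series`, and the sample rows are all assumptions made up for the example; they are not taken from any particular Quandl file or from the platform's own loader.

```python
import csv
import io
from datetime import date

# Made-up sample in the shape of a CSV export (column names are assumed).
SAMPLE_CSV = """Date,Settle
2015-08-24,40.74
2015-08-21,28.03
2015-08-25,
"""

def load_series(text, date_col="Date", value_col="Settle"):
    """Parse CSV text into a list of (date, float) pairs sorted by date."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        raw = rec.get(value_col) or ""
        if not raw.strip():
            continue  # skip rows where the value is missing
        rows.append((date.fromisoformat(rec[date_col]), float(raw)))
    rows.sort(key=lambda r: r[0])
    return rows

series = load_series(SAMPLE_CSV)
```

Sorting by date and dropping empty values up front keeps later lookups simple, since exports are not guaranteed to arrive in ascending order or without gaps.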

If you have specific data sets from Quandl you'd like to see stored natively, please do identify them here and we can work to put them into the system. There are advantages to our built-in processing: greater reliability in processing the data, better modeling of the availability of the data in backtests (we generate a point-in-time database), better reliability in the availability of the data, etc. Further, algorithms that use the data we've processed (vs. data pulled in using Fetcher) can be entered into the contest.

If it is a macroeconomic data set from Quandl (essentially a simple time series of a group of metrics), it is quite easy for us to add the set.

https://www.quandl.com/data/CHRIS/CBOE_VX1
https://www.quandl.com/data/CHRIS/CBOE_VX2

It looks like these have some bad data around 2008.

https://www.quandl.com/data/CHRIS/OSE_NK2251

This one should be useful to see what happened overnight in Japan.

https://www.quandl.com/data/CHRIS/EUREX_FESX1

This one should be useful to see what happened overnight in Europe.

I imagine that it shouldn't be too hard to place the proper timestamps on these? They should be available before trading.
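To make the timestamp idea concrete, here is a minimal sketch of a point-in-time lookup: each observation gets an availability time, and a query only sees values that were already available at that moment. The function names, the one-hour lag, and the sample times and prices are all hypothetical; real settlement and publication times would need to be checked per contract.

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone

def stamp_available(observations, lag):
    """Attach an availability time (observation time + lag) to each value."""
    return sorted((ts + lag, value) for ts, value in observations)

def as_of(stamped, when):
    """Return the latest value whose availability time is <= when, else None."""
    times = [t for t, _ in stamped]
    i = bisect_right(times, when)
    return stamped[i - 1][1] if i else None

# Hypothetical overseas settles, timestamped in UTC (times and prices made up).
obs = [
    (datetime(2015, 8, 21, 6, 0, tzinfo=timezone.utc), 19435.0),
    (datetime(2015, 8, 24, 6, 0, tzinfo=timezone.utc), 18540.0),
]
stamped = stamp_available(obs, lag=timedelta(hours=1))

# 13:30 UTC is 09:30 in New York during daylight saving time.
us_open = datetime(2015, 8, 24, 13, 30, tzinfo=timezone.utc)
value_at_open = as_of(stamped, us_open)
```

Keeping everything in UTC avoids mixing exchange time zones, and the explicit lag is where any assumption about publication delay would go.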

It would be very useful to be able to put a hedging strategy on right out of the gate if the markets dump 5%. It will not reflect what one can actually get in the market, since the trading would have to happen before the market open, but I think it is a good start.
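The trigger itself could be as simple as the sketch below: compute the overnight move from the previous close to the latest overseas print and flag a hedge when the drop reaches the threshold. The function names and the -5% default are illustrative assumptions, and this says nothing about what price the hedge could actually be executed at.

```python
def overnight_return(prev_close, latest):
    """Simple return from the previous close to the latest observed price."""
    if prev_close <= 0:
        raise ValueError("prev_close must be positive")
    return latest / prev_close - 1.0

def should_hedge(prev_close, latest, threshold=-0.05):
    """True when the overnight move is at or below the (negative) threshold."""
    return overnight_return(prev_close, latest) <= threshold

# Example: a drop from 100 to 94 is a 6% decline, so the hedge flag is set.
hedge_on = should_hedge(100.0, 94.0)
```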