[Quantopian Update] - This algorithm is now outdated. Please visit this thread to see recent examples of Accern's data with pipeline and Quantopian 2.
Hello Quantopians,
We're at it again. With the help of Quantopian community members, we recently backtested over 600,000 news and blog articles related to company earnings, covering a 2.5-year period. The results are very interesting. Have a look for yourself in our backtest report below :)
The news and blog dataset is designed by Accern, a big data media analytics firm headquartered in New York City (Wall Street). We monitor over 20 million news and blog sources on the web in real time and provide more than 25 fields for analytics designed specifically for quantitative trading. Accern currently serves some of the largest multi-billion AUM hedge funds worldwide.
The purpose of the backtest was to measure the performance of trading on earnings information in real time and to identify which segment of earnings information generated the most returns. We found that Financial Ratings (a segment of Company Earnings) generated the most returns. We also wanted to measure the effect of Acquisitions, Corporate Governance (management decisions), and Contracts (deals, partnerships) on stock prices and how well these types of events can be used in quant trading.
Our data set contains 16 Event Groups, 78 Event Types (Sub-Groups), 1000+ events, and 30K event variations.
We utilized the following events in our backtest:
- Company Earnings (Event Group)
- Financial Results (Company Earnings - Event Type)
- Financial Forecast (Company Earnings - Event Type)
- Financial Ratings (Company Earnings - Event Type)
- Acquisitions (M&A - Event Type)
- Corporate Governance (Event Group)
- Contracts (Event Group)
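As a rough illustration of how the event selection above might be applied to article records, here is a minimal Python sketch. The field names `event_group` and `event_type` and the sample records are assumptions made for illustration; they are not taken from the actual data set's schema.

```python
# Hypothetical field names; the real data set's column names may differ.
SELECTED_GROUPS = {"Company Earnings", "Corporate Governance", "Contracts"}
SELECTED_TYPES = {"Acquisitions"}  # an Event Type under the M&A group

def is_selected(article):
    """Return True if the article's event is in the backtested set."""
    return (article.get("event_group") in SELECTED_GROUPS
            or article.get("event_type") in SELECTED_TYPES)

# Made-up sample records, for illustration only.
articles = [
    {"ticker": "AAA", "event_group": "Company Earnings", "event_type": "Financial Ratings"},
    {"ticker": "BBB", "event_group": "M&A", "event_type": "Acquisitions"},
    {"ticker": "CCC", "event_group": "Other Group", "event_type": "Other Type"},
]
selected = [a["ticker"] for a in articles if is_selected(a)]
```

Matching on either the group or the type mirrors the list above, where some entries are whole Event Groups and Acquisitions is a single Event Type.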
We utilized the following fields of analytics in our backtest:
Story Sentiment (-1 to 1): This metric calculated the aggregated sentiment score of a specific story.
- A positive sentiment score meant that the story was trending positively.
- A negative sentiment score meant that the story was trending negatively.
- This could be used as a directional trigger.
Article Sentiment (-1 to 1): This metric calculated the sentiment score of an article which was relevant to a company.
- A positive sentiment score meant that the article was written in a positive tone towards a company.
- A negative sentiment score meant that the article was written in a negative tone towards a company.
- This could be used as a directional trigger.
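One way the two sentiment scores might serve as a directional trigger is sketched below in Python. The rule that both scores must agree in sign, and the `min_abs` dead zone, are assumptions for illustration and are not the rules used in the backtest report.

```python
def direction(story_sentiment, article_sentiment, min_abs=0.0):
    """Map two sentiment scores (each between -1 and 1) to a trade direction.

    Returns +1 (long), -1 (short), or 0 (no trade).
    Assumption: trade only when both scores share a sign and each
    exceeds min_abs in magnitude.
    """
    if story_sentiment > min_abs and article_sentiment > min_abs:
        return 1
    if story_sentiment < -min_abs and article_sentiment < -min_abs:
        return -1
    return 0
```

For example, `direction(0.4, 0.2)` gives a long signal, while `direction(0.4, -0.2)` gives no trade because the story and the article disagree.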
Event Impact Score on Entity (1-100): This metric estimated how likely the article was to move the stock price by more than 1% on the same day.
- A high impact score meant that the article had a high probability of affecting the stock price by more than 1%.
- A low impact score meant that the article had a low probability of affecting the stock price by more than 1%.
- This could be used as a decision maker to execute an order / identify critical information to trade on.
Overall Source Rank (1-10): This metric measured how timely a source was and how often it was reposted; it could be used as a trust or virality factor.
- A high overall source rank meant that "source x" was usually the first at releasing articles and other sources usually reposted the same information after "source x" had posted it.
- A lower overall source rank meant that "source x" usually released articles later than other sources and that other sources rarely reposted the same information after "source x" had posted it.
- This could be used as a trust filter to validate a story.
First Mention (TRUE/FALSE): This metric indicated whether a story had gone unmentioned across 20 million sources within a 2-week period.
- TRUE meant that the story hadn’t been mentioned across 20 million sources within a 2-week period.
- FALSE meant that the story had been mentioned across 20 million sources within a 2-week period.
- This could be used as a quick decision maker to execute an order.
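The impact score, source rank, and first-mention fields above could be combined into a simple pre-trade filter. The Python sketch below is one possible arrangement; the field names and the thresholds (`min_impact=80`, `min_source_rank=7`) are hypothetical and are not the values used in the backtest report.

```python
def passes_filters(article, min_impact=80, min_source_rank=7):
    """Return True if an article record clears all three gates.

    Assumed (hypothetical) keys: impact_score (1-100),
    source_rank (1-10), first_mention (bool).
    """
    return (article["first_mention"] is True
            and article["impact_score"] >= min_impact
            and article["source_rank"] >= min_source_rank)

# Made-up sample records, for illustration only.
sample = [
    {"id": 1, "impact_score": 92, "source_rank": 9, "first_mention": True},
    {"id": 2, "impact_score": 95, "source_rank": 3, "first_mention": True},
    {"id": 3, "impact_score": 40, "source_rank": 9, "first_mention": True},
    {"id": 4, "impact_score": 92, "source_rank": 9, "first_mention": False},
]
tradable_ids = [a["id"] for a in sample if passes_filters(a)]
```

Only the first record clears every gate here: the second fails the source rank check, the third fails the impact check, and the fourth is not a first mention.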
The backtest report explains this in more detail. Please review the report and share it with anyone you like. We are currently working with Quantopian to make our historical data available on the platform, and we are also working on a retail Alpha Stream feed (Alpha Stream Lite) that you can use for live trading.
Request access by signing up here: Alpha Stream Lite Sign Up
Here is our backtest report: Accern Event-Driven Backtest Report
CSV File containing 600,000 news and blog articles we used: Accern News/Blog Data Set
Our previous backtest (Trend-Following Strategy): Quantopian Article and PDF Backtest Report
Contact me personally with any questions: [email protected]
Best,
Kumesh Aroomoogan
Co-Founder and CEO, Accern