Dear Community Members,
I am dedicating this thread to hedge fund managers who are interested in Alpha Stream, Accern's institutional news and blog data product. I want a central location to keep all of our current and future algorithms and sample data so that we can showcase them to hedge funds seamlessly. This thread will be updated weekly or bi-weekly with new algorithms and data sets. Community members who are interested in pursuing a career in the hedge fund space will also benefit from this thread.
About Accern
Accern is the world’s first big data media analytics provider to deliver the most comprehensive dataset of actionable and authentic stories and analytics from over 20 million news and blog sources for quantitative trading. Using machine learning, deep learning, and neural network algorithms, we have designed actionable trading metrics for the quantitative trading space. We currently use the data as a standalone model, but it can also be combined with multi-factor models.
Alpha Stream Trading Metrics
Sentiment Analysis
- Article Sentiment (1 to -1): Identifies the attitude the article is written in. This can be used as a directional signal.
- Story Sentiment (1 to -1): Tracks the aggregated sentiment for a specific story. This can be used as a directional signal.
- Average Day Sentiment (1 to -1): Aggregates article sentiment for a company each day. This can be used as a directional signal.
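To make the relationship between Article Sentiment and Average Day Sentiment concrete, here is a minimal sketch of a per-company, per-day aggregation. The column names (`ticker`, `published_at`, `article_sentiment`) and the use of a simple mean are assumptions for illustration; the actual Alpha Stream schema and aggregation method may differ.

```python
import pandas as pd

# Hypothetical article-level rows. Column names are assumptions,
# not the actual Alpha Stream schema.
articles = pd.DataFrame({
    "ticker": ["AAPL", "AAPL", "AAPL", "MSFT"],
    "published_at": pd.to_datetime([
        "2015-06-01 09:15",
        "2015-06-01 14:40",
        "2015-06-02 10:05",
        "2015-06-01 11:30",
    ]),
    "article_sentiment": [0.5, -0.25, 0.75, -0.5],
})


def average_day_sentiment(df):
    """Mean article sentiment per ticker per calendar day (assumed aggregation)."""
    # Strip the time of day so that all articles from one day share a key.
    day = df["published_at"].dt.normalize().rename("date")
    return df.groupby(["ticker", day])["article_sentiment"].mean()


daily = average_day_sentiment(articles)
print(daily)
```

The result is indexed by (ticker, date), so a single day's value can be looked up with `daily.loc[("AAPL", pd.Timestamp("2015-06-01"))]`.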
Rankings
- Overall Source Rank (1-10): Evaluates the credibility of a source based on the timeliness and re-post rate of the stories it releases.
- Event Source Rank (1-10): Evaluates the credibility of a source based on the timeliness and re-post rate of the stories it releases on specific events.
- Overall Author Rank (1-10): Evaluates the credibility of an author based on the timeliness and re-post rate of the stories they release.
- Event Author Rank (1-10): Evaluates the credibility of an author based on the timeliness and re-post rate of the stories they release on specific events.
Impact Analysis
- Overall Event Impact Score (1-100): Probability that an event will have a greater-than-1% impact on any stock.
- Entity Event Impact Score (1-100): Probability that an event will have a greater-than-1% impact on the mentioned stock.
Time and Exposure Analysis
- First Mention (TRUE/FALSE): Alerts you to unique stories before they become widely exposed on the web.
- Story Saturation (Low/Mid/High): Tracks the online exposure rate of a specific story. This can be used as an entry and/or exit signal.
Hedge Fund S&P 500 Strategy A: Long / Short (Weighted)
We have backtested a weighted long/short strategy with a weekly holding period. Every Monday at 9:30 AM, when the market opens, we identify stocks that match the criteria for our “bull” basket and enter long positions in them. A stock matches our “bull” criteria if a trustworthy source released a positively toned story with a high probability of impacting its price. We also identify stocks that match the criteria for our “bear” basket and enter short positions in them. A stock matches our “bear” criteria if a trustworthy source released a negatively toned story with a high probability of impacting its price. All positions are closed on Friday at 3:45 PM.
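For readers who want to see the basket logic in code, here is a rough sketch of how the “bull” and “bear” criteria could be expressed as filters. The thresholds and column names are hypothetical placeholders chosen for illustration; they are not the cutoffs used in the backtest.

```python
import pandas as pd

# Hypothetical cutoffs for "trustworthy", "high impact", and "toned".
# These are illustrative assumptions, not the values used in the backtest.
MIN_SOURCE_RANK = 8       # Overall Source Rank is on a 1-10 scale
MIN_IMPACT_SCORE = 80     # Entity Event Impact Score is on a 1-100 scale
MIN_ABS_SENTIMENT = 0.25  # Article Sentiment is on a -1 to 1 scale


def build_baskets(articles):
    """Return (bull, bear) ticker lists from article-level rows."""
    trusted = articles["overall_source_rank"] >= MIN_SOURCE_RANK
    impactful = articles["entity_event_impact_score"] >= MIN_IMPACT_SCORE
    candidates = articles[trusted & impactful]

    positive = candidates["article_sentiment"] >= MIN_ABS_SENTIMENT
    negative = candidates["article_sentiment"] <= -MIN_ABS_SENTIMENT
    bull = sorted(set(candidates.loc[positive, "ticker"]))
    bear = sorted(set(candidates.loc[negative, "ticker"]))
    return bull, bear


# Made-up rows; column names are assumptions, not the Alpha Stream schema.
sample = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC", "DDD"],
    "overall_source_rank": [9, 9, 4, 10],
    "entity_event_impact_score": [90, 85, 95, 30],
    "article_sentiment": [0.6, -0.7, 0.9, -0.8],
})

bull_basket, bear_basket = build_baskets(sample)
print(bull_basket, bear_basket)
```

In this sample, CCC is dropped because its source rank is too low and DDD because its impact score is too low. The sketch does not resolve the case where the same ticker lands in both baskets during one week; a real strategy would need a rule for that.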
ADDING WEIGHTS: We add a twist to this strategy by weighting some of the metrics. For example, the more positive the sentiment and the higher the impact of an article that mentions a company, the more shares of that company we buy. The more negative the sentiment and the higher the impact, the more shares of that company we short.
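One possible way to turn that idea into position sizes is sketched below. It assumes the raw signal is sentiment multiplied by the impact score (rescaled to 0-1), and that the results are normalized so the absolute weights sum to 1. Both the formula and the `signal_weights` helper are illustrative assumptions, not the exact weighting used in the backtest.

```python
def signal_weights(signals):
    """Map {ticker: (sentiment, impact_score)} to signed portfolio weights.

    Assumed formula: raw = sentiment * impact_score / 100, then scale so
    that the absolute weights sum to 1. Positive weights are long
    positions and negative weights are short positions.
    """
    raw = {
        ticker: sentiment * (impact / 100.0)
        for ticker, (sentiment, impact) in signals.items()
    }
    gross = sum(abs(value) for value in raw.values())
    if gross == 0:
        # No usable signal this week: stay flat.
        return {ticker: 0.0 for ticker in raw}
    return {ticker: value / gross for ticker, value in raw.items()}


# Made-up inputs: (article sentiment, entity event impact score).
weights = signal_weights({
    "AAA": (0.8, 100),
    "BBB": (0.4, 50),
    "CCC": (-0.5, 80),
})
for ticker, weight in sorted(weights.items()):
    print(ticker, round(weight, 4))
```

AAA receives the largest long weight because it has both the most positive sentiment and the highest impact score, while CCC receives a short weight. The weights here are fractions of the portfolio rather than share counts; converting to shares would additionally require prices and capital.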
NOTE: This backtest will take ~10 minutes to start due to the massive 5 million article history.
BACKTEST REPORT (PDF)
https://dl.dropboxusercontent.com/u/428478238/AlphaStream%20Backtest%20Report%20(LongShort).pdf
S&P 500 – Full Historical Data
1 CSV with over 5 million articles
https://www.dropbox.com/s/a7u4hkgeaep0l26/backtest_sp500.csv.gz?dl=0
8 CSVs with over 5 million articles in total, split across files
https://www.dropbox.com/sh/6plq6qgfiljmga4/AABxXz-Vv5XqZcRdjIm219_ja?dl=0
S&P 500 – Quantopian Backtest Data (Only 3 Metrics - Small File)
5 CSVs with over 5 million articles in total, split across files
https://www.dropbox.com/sh/9eem9dvd2hqmort/AAAg0bA3E3JNX5HNDwATciE1a?dl=0
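Since the full history is large, it may help to read the gzipped CSV in chunks rather than loading it all at once. The sketch below counts articles per ticker using pandas' chunked reader. The `ticker` column name and the `count_articles_per_ticker` helper are assumptions for illustration, so check the header of the actual file first.

```python
import gzip
import io

import pandas as pd


def count_articles_per_ticker(source, chunksize=100_000):
    """Stream a gzipped CSV and count rows per ticker without loading it all."""
    totals = {}
    reader = pd.read_csv(source, compression="gzip", chunksize=chunksize)
    for chunk in reader:
        # "ticker" is an assumed column name, not confirmed from the data files.
        for ticker, n in chunk["ticker"].value_counts().items():
            totals[ticker] = totals.get(ticker, 0) + int(n)
    return totals


# Tiny in-memory stand-in for a downloaded .csv.gz file.
sample_csv = "ticker,article_sentiment\nAAA,0.5\nBBB,-0.2\nAAA,0.1\n"
sample_gz = io.BytesIO(gzip.compress(sample_csv.encode("utf-8")))

counts = count_articles_per_ticker(sample_gz, chunksize=2)
print(counts)
```

For a downloaded file, pass its local path (for example, `backtest_sp500.csv.gz`) in place of the in-memory buffer.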
Enjoy,
Kumesh Aroomoogan
Co-Founder and CEO, Accern