Many of you might've already heard of Post Earnings Announcement Drift (PEAD), a topic we've explored before. PEAD occurs after an earnings announcement: the stock's price moves in the direction of the earnings surprise. So if actual earnings come in better than what Wall Street analysts estimated, the stock's price tends to move upwards, and vice versa if the announcement was worse than Wall Street estimates.
That movement doesn't happen only immediately after an earnings surprise. Stocks continue to drift in the direction of the surprise for days afterwards, and there are opportunities to profit from this drift.
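To make the idea of "drift" concrete, one simple way to measure it is the cumulative return over a fixed window of trading days after the announcement. The sketch below uses a hypothetical helper (`post_event_drift`) and made-up prices purely to illustrate the measurement; it is not a result from real data:

```python
import pandas as pd
import numpy as np

def post_event_drift(prices, event_date, window=7):
    # Cumulative return from the event-day close through the next
    # `window` trading days (NaN if no prices follow the event)
    after = prices.loc[prices.index > event_date].iloc[:window]
    if after.empty:
        return np.nan
    return after.iloc[-1] / prices.loc[:event_date].iloc[-1] - 1

# Toy series: a positive surprise on the third day, then upward drift
days = pd.date_range('2015-01-01', periods=10, freq='B')
prices = pd.Series([100.0, 100.0, 101.0, 103.0, 104.0,
                    105.0, 105.0, 106.0, 107.0, 108.0], index=days)
drift = post_event_drift(prices, days[2], window=5)
```

A real study would use abnormal (market-adjusted) returns rather than raw returns, but the windowing logic is the same.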
This got us thinking: are there other types of drift? So we got together with our friends at EventVestor to talk about this possibility. Are there other events (like an earnings surprise) that could provide alpha?
EventVestor aggregates event-driven data and provides a multitude of analytics services on top of it. For example, they have event and announcement dates for:
In its most practical form, the data gives us a list of companies, the dates they announced their events (like share buybacks), the dates we should trade on them, and other relevant details about each event.
With our curiosity piqued regarding other types of announcement price drift, EventVestor graciously provided a large set of sample data for one type of event (share buyback announcements), and we set out to answer the question: can we create an effective strategy with share buyback data?
We'll go through two major steps in the rest of this notebook:
Let's get going.
Let's load up the data and explore the various attributes we get from EventVestor.
Here's the general format of data from EventVestor:
Sample Data Format:
| Ticker | Event Date | Sector | Trade Date | Event Type | Percent of shares |
"""
Let's load in the CSV
"""
import matplotlib.pyplot as pyplot
import seaborn as sns
#: EventVestor data
ev_data = local_csv('event_vestor_data_complete.csv', raw=True)
# Rows in our data set look like this:
ev_data.iloc[0]
"""
Plotting a histogram
"""
ev_data['pct_of_tso_tobuy'].hist(bins=30)
pyplot.title("Frequency of % of total shares bought")
pyplot.xlabel("% of shares bought")
pyplot.ylabel("Frequency")
pyplot.grid(b=None, which=u'major', axis=u'both')
The histogram above shows that companies typically buy back around 2-5% of their total shares when they announce a share buyback, with the next most common range being 5-7.5%. It'll be interesting to see whether or not the percentage of shares bought back affects the drift of a security after the announcement.
import pandas as pd
import numpy as np
"""
Compute the percentage of each type of share buyback
"""
# Series.value_counts() gives you the frequency count of all the unique items in that column
freq = ev_data.buyback_type.value_counts()
# Grab the total and compute the percentages
total = freq.sum()
percentage = (freq/(total + 0.0))*100
# Consolidate Suspends and Reductions into a single column for visual purposes
percentage['Suspensions and Reductions'] = percentage['Suspends'] + percentage['Reduction']
percentage = percentage.drop(['Suspends', 'Reduction'])
# Reverse the columns on percentage
percentage = percentage.iloc[::-1]
# Create a horizontal histogram
pyplot.barh(range(len(percentage)), percentage.values, align='center', alpha=0.4)
# Add labels and annotations
pyplot.xlabel('Percentage')
pyplot.yticks(range(len(percentage)), percentage.index)
pyplot.title('Types of Buyback')
pyplot.grid(b=None, which=u'major', axis=u'both')
for idx, pct in enumerate(percentage.values):
pyplot.annotate(" %0.2f%%" % pct, xy=(pct , idx), va='center')
sns.despine(left=True)
About 98% of all share buyback announcements are either a new share buyback program or an addition to an existing one. And remember from before that the majority of buybacks fall within the 0-7.5% (2-5% plus 5-7.5%) range, so we should look at how the percentage of shares bought back breaks down between new and additional buybacks.
buyback_types = ['New', 'Additional']
fig = pyplot.figure()
# Add subplot for New Buybacks
ax = fig.add_subplot(2, 1, 1)
new_buyback_buy_percents = ev_data.pct_of_tso_tobuy[ev_data.buyback_type == 'New']
sns.distplot(new_buyback_buy_percents, ax=ax)
pyplot.title("New Buybacks")
pyplot.xlabel("Percent of shares bought")
pyplot.ylabel("Frequency")
pyplot.grid(b=None, which=u'major', axis=u'both')
# Add subplot for Additional Buybacks
ax = fig.add_subplot(2, 1, 2, sharex=ax)
new_buyback_buy_percents = ev_data.pct_of_tso_tobuy[ev_data.buyback_type == 'Additional']
sns.distplot(new_buyback_buy_percents, color='green', ax=ax)
pyplot.title("Additional Buybacks")
pyplot.xlabel("Percent of shares bought")
pyplot.ylabel("Frequency")
pyplot.grid(b=None, which=u'major', axis=u'both')
fig.subplots_adjust(wspace=.35, hspace=.4)
sns.despine(left=True)
In both cases, the 0-7.5% range makes up the bulk of shares bought back, for new and additional buybacks alike.
"""
Retrieving the frequency and percentage of the sectors
"""
freq = ev_data.sector.value_counts()
total = freq.sum()
percentage = (freq / (total + 0.0)) * 100
percentage = percentage.iloc[::-1]
pyplot.barh(range(len(percentage)), percentage.values, align='center', alpha=0.4)
# Add labels and annotations
pyplot.xlabel('Percentage')
pyplot.yticks(range(len(percentage)), percentage.index)
pyplot.title('Percentage of Buybacks per Sector')
pyplot.grid(b=None, which=u'major', axis=u'both')
for idx, pct in enumerate(percentage.values):
pyplot.annotate(" %0.2f%%" % pct, xy=(pct , idx), va='center')
sns.despine(left=True)
Above, it looks as though the Financial and Services sectors constitute around ~50% of all buybacks, with Technology, Healthcare, Consumer Goods, Industrial Goods, and Basic Materials making up the remaining ~50%. So, going off of this, we're again curious to ask: does the sector affect whether a company performs a new versus an additional buyback? And are there certain sectors that perform bigger buybacks than others?
We'll be looking at only the top three sectors (Financial, Services, and Technology) to help narrow our scope to the most common and perhaps the most influential.
"""
Plotting the new/additional pie chart for each of the three major sectors, financial, services, and technology
"""
all_sectors = ['FINANCIAL', 'SERVICES', 'TECHNOLOGY']
# Setting up our plotting
fig = pyplot.figure()
colors = sns.color_palette('deep', 3)
for count, sector in enumerate(all_sectors):
"""
Grabbing the percentage of each type of share buyback
"""
# Filtering out for where the ev_data = the current sector
temp_data = ev_data[(ev_data.sector == sector)]
# Series.value_counts() gives you the frequency count of all the unique items in that column
freq = temp_data.buyback_type.value_counts()
# Remove "suspends" and "reductions"
if len(freq.index) > 3:
freq = freq.drop(['Suspends', 'Reduction'])
total = freq.sum()
percentage = (freq / (total + 0.0)) * 100
percentage = percentage.iloc[::-1]
ax = fig.add_subplot(3, 1, count + 1)
ax.barh(range(len(percentage)), percentage.values, align='center', alpha=0.8, color=colors)
pyplot.yticks(range(len(percentage)), percentage.index)
pyplot.title(sector)
pyplot.grid(b=None, which=u'major', axis=u'both')
for idx, pct in enumerate(percentage.values):
pyplot.annotate(" %0.2f%%" % pct, xy=(pct , idx), va='center')
fig.subplots_adjust(hspace=.5)
sns.despine()
pyplot.xlabel("Percentage");
In each of the top three sectors, new buybacks constituted the majority of all buybacks, with a maximum of close to 70% for Financial and a minimum of around 58% for Services. So it looks like, on the whole, all sectors perform more new buybacks than additional ones.
"""
Plotting the percentage of shares histogram for the three major sectors:
financial, services, and technology
"""
all_sectors_names = ['FINANCIAL', 'SERVICES', 'TECHNOLOGY']
colors = ['blue', 'green', 'red']
fig = pyplot.figure()
for i, sector in enumerate(all_sectors_names):
if i != 0:
ax = fig.add_subplot(3, 1, i + 1, sharex=ax)
bins = 20
else:
ax = fig.add_subplot(3, 1, i + 1)
bins = 33
temp_data = ev_data[(ev_data.sector == sector)]
temp_data['pct_of_tso_tobuy'].hist(label=sector, alpha=.4, bins=bins, ax=ax, color=colors[i])
pyplot.title("Percentage Buybacks for %s sector" % sector)
pyplot.xlabel("Percent of shares bought")
pyplot.ylabel("Frequency")
pyplot.grid(b=None, which=u'major', axis=u'both')
fig.subplots_adjust(wspace=.35, hspace=.6)
It seems that in all three major sectors, the 0-7.5% range still constitutes the majority of buybacks. This makes sense because these three sectors make up the majority of all new and additional share buybacks.
So far, we've gone through the data quite extensively and have asked a lot of questions regarding the buyback types, the percentage of shares bought back, and the sectors that each company belongs to. I'm going to briefly summarize what I've done before we begin looking at how these different factors can influence the drift after a share buyback announcement:
Having explored the data, we now have a chance to create a post-buyback announcement drift strategy and optimize it. We'll be comparing backtests using our get_backtest method for each of the different observations we found above:
Finally, we'll see which of these factors have the greatest impact on drift and whether or not isolating these factors results in a greater drift than if we were to trade on the information alone.
The algorithm was created in the Quantopian IDE and tested with the Quantopian backtesting engine. An excerpt from the description of the algorithm describes the approach:
"
We'll be capturing the post share buyback announcement drift by entering a position in a company on the trade date following the announcement, holding the position for 7 days, and then exiting. All positions will be long.
A few notes:
- We're hedging our positions with a beta hedge
- You can change any of the variables found in initialize() to tinker with the different holding periods and the number of days used for the Beta Hedge Calculation
- EventVestor conveniently attaches a column called 'Trade Date' that indicates which day we should be trading our data."
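The actual algorithm lives in the Quantopian IDE, but the core bookkeeping can be sketched outside it. The snippet below is a minimal illustration, assuming a hypothetical `events` DataFrame with `ticker` and `trade_date` columns; it is not the backtested code:

```python
import pandas as pd

def beta(asset_returns, market_returns):
    # OLS beta: covariance with the market divided by market variance.
    # The algorithm shorts the market proxy in proportion to this beta.
    return asset_returns.cov(market_returns) / market_returns.var()

def open_positions(events, today, holding_days=7):
    # A ticker is held if its trade date is on or before `today` and
    # fewer than `holding_days` days have elapsed since that trade date
    held = (events['trade_date'] <= today) & \
           (today < events['trade_date'] + pd.Timedelta(days=holding_days))
    return sorted(events.loc[held, 'ticker'])

events = pd.DataFrame({
    'ticker': ['AAA', 'BBB'],
    'trade_date': pd.to_datetime(['2015-02-02', '2015-02-10']),
})
print(open_positions(events, pd.Timestamp('2015-02-05')))  # ['AAA']
```

The real algorithm works in trading days rather than calendar days and sizes the hedge from a rolling beta window, but the entry/exit logic follows this shape.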
Each of the different algorithm scenarios below filters on a given criterion when making buy decisions. So when we analyze 10-20% versus 2.5-7.5%, we're comparing the algorithm from above, but buying only when buybacks are between 10% and 20% of shares outstanding versus only when they're between 2.5% and 7.5%.
"
"""
Comparing Percent Buybacks
The get_backtest method lets you pass in a backtest ID, and it returns all the different
metrics you'd find when you run a "Full Backtest".
You can access those by calling get_backtest.[hit tab here to bring up options]
A few of the ones you'll see here are:
- Cumulative Performance - For things like Ending Portfolio Value
- Risk - For things like Sharpe, Drawdown
- Daily Performance - For things like Daily Return
"""
#: 10-20%
higher_range = get_backtest('54ce9b59e8885b4850e792de')
#: 2.5-7.5%
lower_range = get_backtest('54ce9b7db13ee10d3dc6591c')
higher_range.cumulative_performance.ending_portfolio_value.plot(label="10-20%")
lower_range.cumulative_performance.ending_portfolio_value.plot(label="2.5-7.5%")
pyplot.title("Returns of Percentage Shares Bought Back")
pyplot.legend(loc='best')
pyplot.ylabel("Portfolio Value in $")
pyplot.xlabel("Time")
pyplot.grid(b=None, which=u'major', axis=u'both')
sns.despine()
"""
Comparing New Versus Additional Buybacks
"""
new = get_backtest('54ce9b9ea134720d2a247395')
additional = get_backtest('54ce9bb54200430d3e5e6dcc')
new.cumulative_performance.ending_portfolio_value.plot(label="New")
additional.cumulative_performance.ending_portfolio_value.plot(label="Additional")
pyplot.title("Returns of New versus Additional Buybacks")
pyplot.legend(loc='best')
pyplot.ylabel("Portoflio Value in $")
pyplot.xlabel("Time")
pyplot.grid(b=None, which=u'major', axis=u'both')
sns.despine()
"""
Comparing Major Sectors (Financial, Services, Technology) versus Minority (Non-major sectors) and
against ALL sectors
"""
major_sectors = get_backtest('54ce9c5f89b6420d367cf1bf')
minor_sectors = get_backtest('54cfbe5249305b0d4c8f1458')
all_sectors = get_backtest('54cf85eb11b99a0d3f1c7800')
major_sectors.cumulative_performance.ending_portfolio_value.plot(label="Major Sectors")
minor_sectors.cumulative_performance.ending_portfolio_value.plot(label="Minor Sectors")
all_sectors.cumulative_performance.ending_portfolio_value.plot(label="All Sectors")
pyplot.title("Returns of Sectors")
pyplot.legend(loc='best')
pyplot.ylabel("Portoflio Value in $")
pyplot.xlabel("Time")
pyplot.grid(b=None, which=u'major', axis=u'both');
sns.despine()
Now, we're going to take the seven backtests we got from above and analyze them all in one place by looking at their:
"""
Analyzing Sharpe Ratios
"""
sharpe_ratios = {}
#: Here, we're taking the last sharpe ratio (which is an annualized total) and plotting that
sharpe_ratios['Higher Range'] = higher_range.risk.sharpe.iloc[-1]
sharpe_ratios['Lower Range'] = lower_range.risk.sharpe.iloc[-1]
sharpe_ratios['New Buybacks'] = new.risk.sharpe.iloc[-1]
sharpe_ratios['Additional Buybacks'] = additional.risk.sharpe.iloc[-1]
sharpe_ratios["Major Sectors"] = major_sectors.risk.sharpe.iloc[-1]
sharpe_ratios["Minor Sectors"] = minor_sectors.risk.sharpe.iloc[-1]
sharpe_ratios["All Sectors"] = all_sectors.risk.sharpe.iloc[-1]
#: Some label creations for the horizontal bar graphs
labels = sorted(sharpe_ratios.keys(), key=lambda x: sharpe_ratios[x])
y_pos = np.arange(len(labels))
sharpes = [sharpe_ratios[s] for s in labels]
pyplot.barh(y_pos, sharpes, align='center', alpha=0.8)
pyplot.yticks(y_pos, labels)
pyplot.title("Sharpe Ratios")
pyplot.grid(b=None, which=u'major', axis=u'both')
"""
Analyzing Average Daily Returns
"""
avg_return = {}
avg_return['Higher Range'] = higher_range.daily_performance.returns.mean()
avg_return['Lower Range'] = lower_range.daily_performance.returns.mean()
avg_return['New Buybacks'] = new.daily_performance.returns.mean()
avg_return['Additional Buybacks'] = additional.daily_performance.returns.mean()
avg_return["Major Sectors"] = major_sectors.daily_performance.returns.mean()
avg_return["Minor Sectors"] = minor_sectors.daily_performance.returns.mean()
avg_return["All Sectors"] = all_sectors.daily_performance.returns.mean()
return_labels = sorted(avg_return.keys(), key=lambda x: avg_return[x])
return_y_pos = np.arange(len(return_labels))
avg_returns = [avg_return[s]*100 for s in return_labels]
pyplot.barh(return_y_pos, avg_returns, align='center', alpha=0.8)
pyplot.yticks(return_y_pos, return_labels)
pyplot.xlabel("% Daily Return")
pyplot.title("Average Daily Returns")
pyplot.grid(b=None, which=u'major', axis=u'both')
"""
Comparing Max Drawdown
"""
drawdowns = {}
drawdowns['Higher Range'] = higher_range.risk.max_drawdown.iloc[-1]
drawdowns['Lower Range'] = lower_range.risk.max_drawdown.iloc[-1]
drawdowns['New Buybacks'] = new.risk.max_drawdown.iloc[-1]
drawdowns['Additional Buybacks'] = additional.risk.max_drawdown.iloc[-1]
drawdowns["Major Sectors"] = major_sectors.risk.max_drawdown.iloc[-1]
drawdowns["Minor Sectors"] = minor_sectors.risk.max_drawdown.iloc[-1]
drawdowns["All Sectors"] = all_sectors.risk.max_drawdown.iloc[-1]
drawdown_labels = sorted(drawdowns.keys(), key=lambda x: drawdowns[x])
drawdown_y_pos = np.arange(len(drawdown_labels))
drawdown = [drawdowns[s]*100 for s in drawdown_labels]
pyplot.barh(drawdown_y_pos, drawdown, align='center', alpha=0.8)
pyplot.yticks(drawdown_y_pos, drawdown_labels)
pyplot.xlabel("% Drawdown")
pyplot.title("Max Drawdown")
pyplot.grid(b=None, which=u'major', axis=u'both')
"""
Here, we're taking the three graphs we found above and putting them all in one place using Matplotlib's
subplots.
"""
print "Strategy Parameter Comparisons"
fig = pyplot.figure()
#: 3, 1, 1 means leave space for 3 rows, 1 column, and X position
ax = fig.add_subplot(3, 1, 1)
ax.grid(b=False)
ax.barh(return_y_pos, avg_returns, align='center', alpha=0.6, color='green')
pyplot.yticks(return_y_pos, return_labels)
pyplot.xlabel("% Daily Return")
pyplot.title("Average Daily Returns")
ax = fig.add_subplot(3, 1, 2)
ax.grid(b=False)
ax.barh(y_pos, sharpes, align='center', alpha=0.8)
pyplot.yticks(y_pos, labels)
pyplot.title("Sharpe Ratios")
ax = fig.add_subplot(3, 1, 3)
ax.grid(b=False)
pyplot.barh(drawdown_y_pos, drawdown, align='center', alpha=0.8, color='red')
pyplot.yticks(drawdown_y_pos, drawdown_labels)
pyplot.xlabel("% Drawdown")
pyplot.title("Max Drawdown")
fig.subplots_adjust(wspace=.35, hspace=.6)
So now, looking at the different strategy parameters, we see that the "All Sectors" strategy produced the biggest daily return, while the "Higher Range" (10-20%) strategy provided the best Sharpe ratio. Both are among the three lowest max drawdowns (~11%).
higher_range.cumulative_performance.ending_portfolio_value.plot(label="All buybacks > 7.5%")
all_sectors.cumulative_performance.ending_portfolio_value.plot(label='All % buybacks')
pyplot.title("Comparing returns of the two best strategies")
pyplot.legend(loc='best')
The Higher Range strategy provides a lower level of volatility. But remember that we only took buybacks between 10-20%. Now what if we say:
Take all the events where the buybacks are greater than 7.5% AND trading across all sectors
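As a rough sketch, here's what that filter looks like on the raw events, using the `pct_of_tso_tobuy` and `sector` columns from earlier on a small made-up frame (the real filtering happens inside the Quantopian algorithm):

```python
import pandas as pd

# Made-up events, for illustration only
ev_data = pd.DataFrame({
    'ticker': ['A', 'B', 'C', 'D'],
    'sector': ['FINANCIAL', 'SERVICES', 'UTILITIES', 'TECHNOLOGY'],
    'pct_of_tso_tobuy': [5.0, 12.0, 8.0, 9.5],
})

# The earlier "Higher Range" scenario: buybacks between 10% and 20%
higher_range_events = ev_data[ev_data.pct_of_tso_tobuy.between(10, 20)]

# The combined scenario: any sector, any buyback above 7.5% of shares
combined_events = ev_data[ev_data.pct_of_tso_tobuy > 7.5]
print(list(combined_events.ticker))  # ['B', 'C', 'D']
```

Note that no sector condition is needed for the combined scenario; relaxing the range to "> 7.5%" while keeping all sectors is what widens the event universe.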
"""
This algorithm looks at all sectors but only buybacks that are greater than 7.5%
"""
combined_results = get_backtest('54cfe2ac2149e40d4d542337')
combined_results.cumulative_performance.ending_portfolio_value.plot(label="All buybacks > 7.5%")
higher_range.cumulative_performance.ending_portfolio_value.plot(label='Buybacks between 10-20%')
all_sectors.cumulative_performance.ending_portfolio_value.plot(label='All % buybacks')
pyplot.title("Comparing returns of the best strategies")
pyplot.legend(loc='best')
"""
Here, we're doing the exact same thing that we did before, but putting it all into one place
"""
#: Creating the labels
drawdowns = {}
drawdowns['Buybacks between 10-20%'] = higher_range.risk.max_drawdown.iloc[-1]
drawdowns["All % buybacks"] = all_sectors.risk.max_drawdown.iloc[-1]
drawdowns["All buybacks > 7.5%"] = combined_results.risk.max_drawdown.iloc[-1]
drawdown_labels = sorted(drawdowns.keys(), key=lambda x: drawdowns[x])
drawdown_y_pos = np.arange(len(drawdown_labels))
drawdown = [drawdowns[s]*100 for s in drawdown_labels]
avg_return = {}
avg_return['Buybacks between 10-20%'] = higher_range.daily_performance.returns.mean()
avg_return["All % buybacks"] = all_sectors.daily_performance.returns.mean()
avg_return["All buybacks > 7.5%"] = combined_results.daily_performance.returns.mean()
return_labels = sorted(avg_return.keys(), key=lambda x: avg_return[x])
return_y_pos = np.arange(len(return_labels))
avg_returns = [avg_return[s]*100 for s in return_labels]
sharpe_ratios = {}
sharpe_ratios['Buybacks between 10-20%'] = higher_range.risk.sharpe.iloc[-1]
sharpe_ratios["All % buybacks"] = all_sectors.risk.sharpe.iloc[-1]
sharpe_ratios["All buybacks > 7.5%"] = combined_results.risk.sharpe.iloc[-1]
labels = sorted(sharpe_ratios.keys(), key=lambda x: sharpe_ratios[x])
y_pos = np.arange(len(labels))
sharpes = [sharpe_ratios[s] for s in labels]
#: Creating the subplots
fig = pyplot.figure()
ax = fig.add_subplot(3, 1, 1)
ax.grid(b=False)
ax.barh(return_y_pos, avg_returns, align='center', alpha=0.6, color='green')
pyplot.yticks(return_y_pos, return_labels)
pyplot.xlabel("% Daily Return")
pyplot.title("Average Daily Returns")
ax = fig.add_subplot(3, 1, 2)
ax.grid(b=False)
ax.barh(y_pos, sharpes, align='center', alpha=0.8)
pyplot.yticks(y_pos, labels)
pyplot.title("Sharpe Ratios")
ax = fig.add_subplot(3, 1, 3)
ax.grid(b=False)
pyplot.barh(drawdown_y_pos, drawdown, align='center', alpha=0.8, color='red')
pyplot.yticks(drawdown_y_pos, drawdown_labels)
pyplot.xlabel("% Drawdown")
pyplot.title("Max Drawdown")
fig.subplots_adjust(wspace=.35, hspace=.6)
Voila. By doing some rough parameter optimization, we can estimate a better range of parameters for our strategy. In this case, we found that the 10-20% buybacks across all sectors provided the highest returns, so we made a broader guess that all buybacks greater than 7.5% might do even better. By doing that, we minimized the drawdown, got the best Sharpe ratio, and obtained returns almost as good as the more volatile, higher-return option.
EventVestor provides data feeds for many different corporate events.
Interested in using this data in your own Quantopian algorithm? Contact EventVestor today.