Notebook

Begin by importing some needed modules

In [105]:
import pandas as pd
from datetime import datetime, timedelta
import matplotlib.pyplot as pyplot
from zipline import TradingAlgorithm
from zipline.api import order_target_percent, symbol, get_datetime, get_order, sid, schedule_function, date_rules

Load index change data from EventVestor, format data, and create helper functions

In [42]:
index_changes = local_csv('index_inclusion.csv')
In [135]:
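# Copy the filtered rows so the later tradeDate assignment does not write to a view of index_changes
sp_500_changes = index_changes[index_changes["indexName"] == "S&P 500"].copy()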
In [44]:
def convert_to_datetime(date):
    return pd.to_datetime(str(date), utc=True)
In [192]:
def compare(results, title):

    pyplot.plot(results["benchmark_period_return"].index,
                results["benchmark_period_return"] * 100.0,
                label="SPY (S&P 500)", color="silver")
    
    pyplot.plot(results["algorithm_period_return"].index,
                results["algorithm_period_return"] * 100.0,
                label="My Front Running Strategy", color="#c50000")

    ax = pyplot.gca()  # reuse the axes the two plot() calls drew on
    ax.xaxis.grid()
    ax.yaxis.grid(color="#DFDFDF")

    pyplot.legend(loc='best', prop={'size':13})
    pyplot.title(title, fontsize=20)
    pyplot.ylabel("% Return", fontsize=14)
In [ ]:
sp_500_changes["tradeDate"] = sp_500_changes["tradeDate"].apply(convert_to_datetime)

What is the strategy?

I'll use a simple strategy: buy a company's stock the day its addition to the S&P 500 is announced and sell it 5 days later.

Why 5 days?

>According to the S&P Indices Methodology Guide, "Constituent changes are typically announced one to five days before they are scheduled to be implemented." So I'll use 5 days to be safe, though the optimal implementation would be to sell exactly when all the funds are buying.
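A holding window measured in calendar days can span a weekend, so the actual exit day depends on the weekday of the announcement. The sketch below mirrors the `timedelta(days=5)` age check used in `run_strategy`, under the assumption that every Monday to Friday is a trading day (holidays are ignored). The helper name `first_exit_day` is made up for illustration.

```python
from datetime import date, timedelta

HOLD = timedelta(days=5)  # same threshold as the strategy's age check


def first_exit_day(entry_day, hold=HOLD):
    """Return the first weekday strictly more than `hold` after entry_day.

    Assumption: every Monday-Friday is a trading day (holidays ignored).
    """
    day = entry_day + timedelta(days=1)
    # Keep stepping while the position is not yet old enough,
    # or while the candidate day falls on a weekend.
    while day - entry_day <= hold or day.weekday() >= 5:
        day += timedelta(days=1)
    return day


monday = date(2015, 6, 1)     # a Monday
wednesday = date(2015, 6, 3)  # a Wednesday
print(first_exit_day(monday))     # 2015-06-08, the following Monday
print(first_exit_day(wednesday))  # 2015-06-09, the following Tuesday
```

Under these assumptions a Monday entry is held for seven calendar days and a Wednesday entry for six, because the check requires the position to be strictly older than five days.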

Balancing?

>Sometimes multiple additions will be announced on the same day. In that case we will buy an equal amount of each security where the summed value equals the portfolio's total value. Note: This can lead to a leveraged account if there are multiple days that additions are made within 5 days of each other. The max leverage observed in the backtests below was 2.5 and occurred ~3 times per year.
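To see why overlapping announcements create leverage, here is a rough sketch that counts how many full-portfolio batches are held on each day. It assumes each announcement day's batch is sized at 100% of portfolio value, that prices do not move, and that a batch is held for five calendar days. The function name and the dates are invented for illustration. Because price changes are ignored, the sketch only produces whole numbers, unlike the leverage seen in a real backtest.

```python
from datetime import date, timedelta


def approximate_leverage(announcement_days, hold_days=5):
    """Map each calendar day to the number of full-size batches held that day.

    Assumptions: each batch targets 100% of portfolio value, prices are
    constant, and a batch is held from its announcement day through
    `hold_days` calendar days later.
    """
    exposure = {}
    for start in announcement_days:
        for offset in range(hold_days + 1):
            day = start + timedelta(days=offset)
            exposure[day] = exposure.get(day, 0.0) + 1.0
    return exposure


# Hypothetical announcement dates: two close together, one isolated.
announcements = [date(2015, 3, 2), date(2015, 3, 4), date(2015, 3, 20)]
leverage = approximate_leverage(announcements)
print(max(leverage.values()))  # 2.0 while the first two batches overlap
```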

In [156]:
def run_strategy(data, changes):
    
    def initialize(context):
        context.index_changes = changes
        context.orders = []
        schedule_function(trade, date_rule=date_rules.every_day())
    
    
    def handle_data(context, data):
        pass
    
    
    def trade(context, data):
            
        today = get_datetime().replace(hour=0, minute=0, second=0, microsecond=0)
        todays_events = context.index_changes[context.index_changes["tradeDate"] == today]
                    
        to_sell = []
        # Check if any orders are over 5 days old
        for order_id in list(context.orders):  # iterate over a copy; the loop body removes items
            this_order = get_order(order_id)
            if this_order and get_datetime() - this_order.created > timedelta(days=5):
                to_sell.append(this_order.sid)
                context.orders.remove(order_id)                
        
                        
        if len(todays_events) > 0: 
            to_buy = []         
            # See if there are any additions to the S&P 500
            for row in todays_events.iterrows():
                event_type = row[1][-1]
                try:
                    security = symbol(row[1][-5])
                    if event_type == "Addition" and security.sid not in context.portfolio.positions.keys():
                        to_buy.append(security)
                except Exception:  # skip events whose symbol cannot be resolved
                    continue
                        
            if len(to_buy) > 0:                 
                # Buy new stocks
                for security in to_buy:
                    if security in data:
                        order_id = order_target_percent(security, 1.0 / len(to_buy))
                        context.orders.append(order_id)
                    else:
                        print("%s not in data" % security.symbol)
                    
        if len(to_sell) > 0:
            # Sell old stocks
            for security in to_sell:
                if security in data:
                    order_target_percent(security, 0)
                else:
                    print("%s not in data" % security.symbol)
    
    
    my_algo = TradingAlgorithm(
        initialize=initialize, 
        handle_data=handle_data,    
        data_frequency="minute"
    )    
    
    results = my_algo.run(data)
    return results
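The age check inside `trade` can also be isolated as a small pure function, which makes it easy to test outside a backtest. Removing items from a list while looping over that same list can skip elements, so this sketch builds two new lists instead. The `Order` tuple is a hypothetical stand-in for a zipline order object and carries only the fields used here.

```python
from collections import namedtuple
from datetime import datetime, timedelta

# Hypothetical stand-in for a zipline order; only the fields used below.
Order = namedtuple("Order", ["id", "sid", "created"])


def split_expired(orders, now, max_age=timedelta(days=5)):
    """Return (expired, still_open) without mutating `orders`."""
    expired = [o for o in orders if now - o.created > max_age]
    still_open = [o for o in orders if now - o.created <= max_age]
    return expired, still_open


now = datetime(2015, 6, 10)
orders = [
    Order("a", 1, datetime(2015, 6, 1)),
    Order("b", 2, datetime(2015, 6, 2)),
    Order("c", 3, datetime(2015, 6, 8)),
]
expired, still_open = split_expired(orders, now)
print([o.id for o in expired])     # ['a', 'b']
print([o.id for o in still_open])  # ['c']
```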

Let's try it out with a backtest!

I only have data from 2007 to the present, so I'll select sample periods from both ends and one during the financial crisis. It will be interesting to see how effective the strategy is during times of hardship. In addition, index fund investing has grown considerably over that period. For example, look at Vanguard Group, one of the largest index fund issuers in the world. A graph of their Assets Under Management (AUM) shows that their fastest growth starts a little after 2006. How will this strategy evolve as indexing becomes more popular?

2007 - 2008 | Performance

In [47]:
sp_500_changes_2007_2008 = sp_500_changes[sp_500_changes["tradeDate"] > pd.to_datetime("2007-7-1", utc=True)]   
sp_500_changes_2007_2008 = sp_500_changes_2007_2008[sp_500_changes_2007_2008["tradeDate"] < pd.to_datetime("2008-7-1", utc=True)]   
In [48]:
data_2007_2008 = get_pricing(
                        sp_500_changes_2007_2008.tickerSymbol.unique(),     
                        start_date=pd.to_datetime("2007-7-1", utc=True), 
                        end_date=pd.to_datetime("2008-7-1", utc=True),
                        fields='price',
                        handle_missing='ignore',
                        frequency="minute",
                        ).ffill()
In [157]:
results_2007_2008 = run_strategy(data_2007_2008, sp_500_changes_2007_2008)
In [193]:
compare(results_2007_2008, "2007 - 2008")
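The two-step date filter used for this sample period can be expressed as one small helper. This is only a sketch: `select_window` is a name made up here, and the events are placeholders rather than real index changes. Like the filter above, it excludes rows that fall exactly on either boundary.

```python
import pandas as pd


def select_window(changes, start, end, date_col="tradeDate"):
    """Rows whose date lies strictly between start and end (both exclusive)."""
    start = pd.to_datetime(start, utc=True)
    end = pd.to_datetime(end, utc=True)
    mask = (changes[date_col] > start) & (changes[date_col] < end)
    return changes[mask]


# Made-up events; the tickers are placeholders, not real index changes.
events = pd.DataFrame({
    "tradeDate": pd.to_datetime(
        ["2007-06-30", "2007-07-01", "2007-12-15", "2008-07-01", "2008-08-01"],
        utc=True),
    "tickerSymbol": ["AAA", "EDGE1", "BBB", "EDGE2", "CCC"],
})
window = select_window(events, "2007-07-01", "2008-07-01")
print(window["tickerSymbol"].tolist())  # ['BBB']
```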

2008 - 2009 | Financial Crisis

While the above results are clearly atypical, there does seem to be a general upward trend. How does this strategy fare during times of economic stress?

In [96]:
sp_500_changes_2008_2009 = sp_500_changes[sp_500_changes["tradeDate"] > pd.to_datetime("2008-7-1", utc=True)]   
sp_500_changes_2008_2009 = sp_500_changes_2008_2009[sp_500_changes_2008_2009["tradeDate"] < pd.to_datetime("2009-7-1", utc=True)] 
In [97]:
data_2008_2009 = get_pricing(
                        sp_500_changes_2008_2009.tickerSymbol.unique(),     
                        start_date=pd.to_datetime("2008-7-1", utc=True), 
                        end_date=pd.to_datetime("2009-7-1", utc=True),
                        fields='price',
                        handle_missing='ignore',
                        frequency="minute",
                        ).ffill()
In [159]:
results_2008_2009 = run_strategy(data_2008_2009, sp_500_changes_2008_2009)
In [194]:
compare(results_2008_2009, "Financial Crisis")

2014 - 2015 | Does the strategy still work today?

The above results show only a marginal gain but plenty of volatility. This was a rough time for the market, though. A good extension would be to run a separate study on how the general instability of the US stock market affected the results. As index investing and index front running have grown, will the strategy still yield abnormally high returns?
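One way to put a number on "plenty of volatility" is to annualize the standard deviation of daily returns. The sketch below assumes the return series is cumulative since the start of the backtest, with one value per trading day, and that a year has 252 trading days. The function names and the sample numbers are made up for illustration.

```python
import math
import statistics


def daily_returns(cumulative):
    """Convert cumulative period returns (0.05 == +5%) to daily returns."""
    growth = [1.0 + r for r in cumulative]
    return [growth[i] / growth[i - 1] - 1.0 for i in range(1, len(growth))]


def annualized_volatility(cumulative, periods_per_year=252):
    """Sample standard deviation of daily returns, scaled to one year.

    Assumption: 252 trading days per year.
    """
    return statistics.stdev(daily_returns(cumulative)) * math.sqrt(periods_per_year)


# Made-up cumulative returns, one value per trading day.
steady = [0.00, 0.01, 0.02, 0.03, 0.04]
choppy = [0.00, 0.05, -0.02, 0.06, 0.04]
print(annualized_volatility(steady) < annualized_volatility(choppy))  # True
```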

In [55]:
sp_500_changes_2014_2015 = sp_500_changes[sp_500_changes["tradeDate"] > pd.to_datetime("2014-7-1", utc=True)]   
sp_500_changes_2014_2015 = sp_500_changes_2014_2015[sp_500_changes_2014_2015["tradeDate"] < pd.to_datetime("2015-7-1", utc=True)]
In [56]:
data_2014_2015 = get_pricing(
                        sp_500_changes_2014_2015.tickerSymbol.unique(),     
                        start_date=pd.to_datetime("2014-7-1", utc=True), 
                        end_date=pd.to_datetime("2015-7-1", utc=True),
                        fields='price',
                        handle_missing='ignore',
                        frequency="minute",
                        ).ffill()
In [161]:
results_2014_2015 = run_strategy(data_2014_2015, sp_500_changes_2014_2015)
In [195]:
compare(results_2014_2015, "Recent Performance 2014-2015")

Final thoughts...

The strategy of front running index funds is sound and will certainly make money if implemented optimally. I'm excited to see how this phenomenon changes as more investors look to invest in index funds and as more investors look to game them.

Will the efficacy of front running index funds decline as more investors take advantage?

How will index funds mitigate the losses incurred by front runners?

Questions? | jhall@quantopian.com
