How to get started in creating a simple Python Arbitrage algorithm

Hi all,

I am new here at Quantopian and to creating and backtesting algorithms. I was wondering how to actually get started in creating a price arbitrage trading algorithm. Any help at all would be great!

Thanks,
Justin

9 responses

Welcome Justin,

On http://blog.quantopian.com/ you'll find links to various algorithms posted to the forum (see the most recent post, and the Feb. 10 one).

When you say "a price arbitrage trading algorithm" what do you have in mind?

Grant

Thank you for the info. By arbitrage I mean trading that profits by exploiting the price differences of identical or similar financial instruments, that is, looking for market inefficiencies to profit from.

Hi Justin,

For another recent example of a pairs arb algo, check out this post on Ernie Chan's EWA/EWC pair trade with Kalman filter.

Best,
Jess

Hi Justin,
To select pairs of stocks suitable for pairs trading, you can have a look at http://www.pairslog.com (registration is required, but it is free). Their screening is good, in my opinion.
To backtest the pairs suggested by pairslog using their rules, I wrote a simple script based on zipline (the open-source market simulator developed by Quantopian). I will post the code here in the next few days (I have to clean up and comment the code before posting).

Thank you all for the resources! I will be taking a look at all of this and come back if I have any questions. Thanks again!

Here is the zipline code that I use to backtest the pairs that I take from pairslog.com.
I'm using zipline because I do not like the Quantopian IDE (it lacks editing facilities, and the slow, limited logging makes debugging a painful experience...).
I tried to match the testing rules used by pairslog (zero commission and slippage, EOD data, a 14-day window for the mean and std). However, I was not able to reproduce their results in terms of return (my returns are lower).
Probably there is some factor that I'm not taking into account... any suggestions are welcome!

# This is the zipline script that I use to verify the performance
# of the pairs suggested by pairslog.com (note: obviously
# the results published by pairslog.com are correct and do not
# need verification, but I like to be sure that I can reproduce
# exactly the conditions under which the measurements were made).
# Pairs trading is therefore performed according to the following
# pairslog.com rules:
# - the spread is simply the ratio between the prices of the stocks
# - the zscore is what pairslog calls delta
# - the window_length is set to 14 days
# - the slippage is set to 0
# - the commissions are set to 0
# - the spread is bought when zscore (i.e. delta, in pairslog language) is <= -2
#   and sold when zscore is >= 2. The position is closed when zscore reverts to 0.
#
import matplotlib.pyplot as plt  
import numpy as np  
import pandas as pd  
import statsmodels.api as sm  
from datetime import datetime  
import pytz  
from logbook import Logger

from zipline.algorithm import TradingAlgorithm  
from zipline.transforms import batch_transform  
from zipline.utils.factory import load_from_yahoo  
from zipline.finance import commission,slippage

class Pairtrade(TradingAlgorithm):  
    def initialize(self,stockA,stockB, window_length=14):  
        self.spreads = []  
        self.capital_base=10000  
        self.invested = 0  
        self.window_length = window_length  
        self.instant_fill=True                    #Pairslog results are built using EOD data. (I assumed same day of signal)  
        self.stockA=stockA  
        self.stockB=stockB  
        self.posSizeA=self.capital_base  
        self.posSizeB=self.capital_base           #I assumed 50% margin for both long and short trades  
        self.set_commission(commission.PerTrade(cost=0))        #Pairslog results do not consider commissions.  
        self.set_slippage(slippage.FixedSlippage(spread=0.0))   #Pairslog results are built using EOD data and do not consider liquidity factor.  
        self.txnumber=0  
        self.trades = pd.DataFrame()

    def handle_data(self, data):  
        zscore = self.compute_zscore(data)  
        if (len(self.spreads) < self.window_length):  
            return  
        self.record(zscores=zscore)  
        self.place_orders(data, zscore)

    def compute_zscore(self, data):  
        spread = data[self.stockA].price / data[self.stockB].price  
        self.spreads.append(spread)  
        spread_wind = self.spreads[-self.window_length:]  
        zscore = (spread - np.mean(spread_wind)) / np.std(spread_wind)  
        return zscore

    def place_orders(self, data, zscore):  
        """Buy spread if zscore is <= -2, sell if zscore >= 2,  
           close the trade when zscore crosses 0  
        """  
        #log.info(str(self.get_datetime())+' amount: '+str(self.portfolio.positions[self.stockA].amount))  
        if zscore >= 2.0 and self.invested==0:  
            self.order(self.stockA, -int(self.posSizeA/ data[self.stockA].price))  
            self.order(self.stockB, int(self.posSizeB / data[self.stockB].price))  
            self.invested = 1  
            #log.info(str(self.get_datetime())+' amountS: '+str(self.portfolio.positions[self.stockA].amount))  
        elif zscore <= -2.0 and self.invested==0:  
            self.order(self.stockA, int(self.posSizeA/ data[self.stockA].price))  
            self.order(self.stockB, -int(self.posSizeB / data[self.stockB].price))  
            self.invested = 2  
            #log.info(str(self.get_datetime())+' amountB: '+str(self.portfolio.positions[self.stockA].amount))  
        elif (zscore <= 0 and self.invested==1) or (zscore >= 0 and self.invested==2):  
            self.sell_spread()  
            self.invested = 0  


    def sell_spread(self):  
        """  
        Close both legs, regardless of whether the position is long or short:
        buy back a short leg, sell a long leg.
        """  
        self.txnumber=self.txnumber+1  
        stockB_amount = self.portfolio.positions[self.stockB].amount  
        self.order(self.stockB, -1 * stockB_amount)  
        #log.info(str(self.get_datetime())+' '+str(stockB_amount))  
        stockA_amount = self.portfolio.positions[self.stockA].amount  
        self.order(self.stockA, -1 * stockA_amount)

if __name__ == '__main__':  
    log = Logger('')  
    start = datetime(2011, 3, 28, 0, 0, 0, 0, pytz.utc)  
    end = datetime(2014, 3, 28, 0, 0, 0, 0, pytz.utc)  
    #stockA='gaz'  
    #stockB='uco'  
    #stockA='mzz'  
    #stockB='xop'  
    stockA='IVV'  
    stockB='SPY'  
    data = load_from_yahoo(stocks=[stockA, stockB], indexes={},  
                           start=start, end=end)  
    pairtrade = Pairtrade(stockA,stockB)  
    results = pairtrade.run(data)  
    #data['spreads'] = np.nan

    ax1 = plt.subplot(411)  
    data[[stockA,stockB]].plot(ax=ax1)  
    plt.ylabel('price')  
    plt.setp(ax1.get_xticklabels(), visible=False)

    ax2 = plt.subplot(412, sharex=ax1)  
    results.zscores.plot(ax=ax2, color='r')  
    plt.ylabel('zscore')  
    ax3 = plt.subplot(413, sharex=ax1)  
    pd.Series(np.array(pairtrade.spreads), results.zscores.index).plot(ax=ax3, color='b')  # pd.TimeSeries was removed from newer pandas
    plt.ylabel('spread')  
    ax4 = plt.subplot(414, sharex=ax1)  
    results['portfolio_value'].plot(ax=ax4, color='b')  
    plt.ylabel('portfolio value')

    plt.gcf().set_size_inches(18, 8)  
    plt.show()  
    print(results['portfolio_value'])  
    print('Number of trades:'+str(pairtrade.txnumber))  
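
The z-score used by the script above can also be checked outside zipline. The following is a minimal sketch, not the pairslog method itself: it assumes the same definition as compute_zscore (price ratio, trailing 14-day window that includes the current day, population standard deviation as in np.std). The function name rolling_zscore and the synthetic prices are hypothetical and only for illustration.

```python
import numpy as np
import pandas as pd


def rolling_zscore(price_a, price_b, window=14):
    """Z-score of the price ratio over a trailing window that includes today."""
    spread = price_a / price_b
    mean = spread.rolling(window).mean()
    # ddof=0 gives the population std, which is what np.std computes by default
    std = spread.rolling(window).std(ddof=0)
    return (spread - mean) / std


# Synthetic prices, for illustration only (not market data)
rng = np.random.default_rng(0)
dates = pd.date_range("2014-01-01", periods=60, freq="B")
price_b = pd.Series(100 + rng.normal(0, 1, 60).cumsum(), index=dates)
price_a = price_b * (1 + rng.normal(0, 0.01, 60))

zscores = rolling_zscore(price_a, price_b)
print(zscores.dropna().tail())
```

Note that pandas defaults to the sample standard deviation (ddof=1) while np.std defaults to ddof=0, so the two give slightly different z-scores on a 14-day window unless ddof is set explicitly. Whether pairslog uses one or the other is not known here.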

Code looks nice. Will definitely take a look at it when I get a chance! Thank you.

Leo this is awesome - how do I get your module to print the trades and trade dates?
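
One possible way to list the trades and their dates, sketched without zipline: replay the same entry and exit rules as place_orders on a z-score series and collect each action with its date. The names trade_log and entry are hypothetical, and the z-scores below are made up for illustration. With the script above, the recorded results.zscores column could presumably be passed in instead, but this only reports when the signals fire, not the actual fills.

```python
import pandas as pd


def trade_log(zscores, entry=2.0):
    """Replay the entry/exit rules on a z-score series and list each action.

    Hypothetical helper: it mirrors the rules in place_orders, not zipline's
    own transaction records.
    """
    rows = []
    state = 0  # 0 = flat, 1 = short the spread, 2 = long the spread
    for date, z in zscores.dropna().items():
        if state == 0 and z >= entry:
            state = 1
            rows.append((date, "sell spread", z))
        elif state == 0 and z <= -entry:
            state = 2
            rows.append((date, "buy spread", z))
        elif (state == 1 and z <= 0) or (state == 2 and z >= 0):
            state = 0
            rows.append((date, "close", z))
    return pd.DataFrame(rows, columns=["date", "action", "zscore"])


# Hand-made z-scores, for illustration only
dates = pd.date_range("2014-01-01", periods=6, freq="D")
z = pd.Series([0.5, 2.3, 1.0, -0.2, -2.5, 0.1], index=dates)
print(trade_log(z))
```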