Sensitivity Analysis a.k.a. "Parameter Optimization" of Pair Trade Input Parameters

posted Jul 23, 2015

Today I'd like to share a notebook in our Research environment that shows how to run backtests of your algo across a range of input parameters, and then plot the results as heatmaps so you can see how sensitive the algo is to small changes in those parameters.

You can step through the notebook one cell at a time, or just do "Run All...". If you use "Run All...", the whole notebook may take upwards of 30 minutes to complete, because I have it set up to run 25 backtests in total: two parameters varied over five values each.

It is also worth mentioning that, at present, your algo must be written in Zipline, entirely within the Research environment, in order to do this. The notebook I'm sharing here is a basic implementation of a pair trading algo, and you can freely modify the following inputs:

the 2 stocks in the pair
start and end dates for the backtest
Z-score entry/exit thresholds (e.g. +/- 1.0 standard deviations) that determine when to enter and exit trades, based on how far the pair's spread has diverged
the lookback number of days to use for computing the hedge ratio (i.e. the # of days used in the regression)
the lookback number of days for calculating the Z-score for determining whether the pair is diverging
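
The two lookback inputs above can be illustrated with a minimal sketch. It assumes the hedge ratio is the slope of an ordinary least-squares fit of one leg's prices on the other's, and that the Z-score is taken on the resulting spread; the function names and the synthetic prices are hypothetical and are not taken from the notebook, which may compute these quantities differently (for example on a rolling basis).

```python
import numpy as np

def hedge_ratio(y_prices, x_prices, lookback):
    """Slope of an OLS fit of y on x over the last `lookback` observations."""
    y = np.asarray(y_prices, dtype=float)[-lookback:]
    x = np.asarray(x_prices, dtype=float)[-lookback:]
    slope, _intercept = np.polyfit(x, y, 1)
    return slope

def spread_zscore(y_prices, x_prices, hedge_lookback, z_lookback):
    """Z-score of the latest spread value over the last `z_lookback` observations."""
    y = np.asarray(y_prices, dtype=float)
    x = np.asarray(x_prices, dtype=float)
    beta = hedge_ratio(y, x, hedge_lookback)
    # Simplification: one hedge ratio is applied to the whole Z-score window.
    spread = (y - beta * x)[-z_lookback:]
    return (spread[-1] - spread.mean()) / spread.std()

# Synthetic prices for illustration only (not market data).
rng = np.random.RandomState(0)
x = 100 + np.cumsum(rng.randn(250))
y = 2.0 * x + rng.randn(250)
print(hedge_ratio(y, x, 60), spread_zscore(y, x, 60, 20))
```

A shorter `hedge_lookback` lets the ratio adapt faster but makes it noisier; a shorter `z_lookback` makes the Z-score react to more recent divergences only.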

The parameters whose sensitivity I'm testing in this example are:

the lookback number of days to use for computing the hedge ratio (i.e. the # of days used in the regression)
the lookback number of days for calculating the Z-score for determining whether the pair is diverging

By varying each of these inputs and viewing the resulting heatmaps, I can see whether computing the spread over shorter or longer timeframes results in more profitable trades (based on the days used in the hedge ratio regression). I can also see whether I should trigger trades on shorter- or longer-term divergences (based on the days used to compute the Z-score).
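
The sweep described above amounts to a nested loop over two parameter ranges that fills a 5x5 grid. Here is a rough sketch of that structure only: `run_backtest` is a hypothetical stand-in that returns a made-up score instead of running Zipline, and `param_range_2` is an assumed name for the second range.

```python
import numpy as np

def run_backtest(hedge_lookback, z_lookback):
    """Hypothetical stand-in for one backtest; returns a placeholder score."""
    # Made-up surface for illustration, NOT real backtest results.
    return -((hedge_lookback - 60) ** 2 + (z_lookback - 40) ** 2) / 1000.0

param_range_1 = [int(v) for v in np.linspace(20, 100, 5)]  # hedge ratio lookback
param_range_2 = [int(v) for v in np.linspace(20, 100, 5)]  # Z-score lookback (assumed name)

# One row per hedge ratio lookback, one column per Z-score lookback.
results = np.zeros((len(param_range_1), len(param_range_2)))
for i, p1 in enumerate(param_range_1):
    for j, p2 in enumerate(param_range_2):
        results[i, j] = run_backtest(p1, p2)

best_i, best_j = np.unravel_index(results.argmax(), results.shape)
print("best cell:", param_range_1[best_i], param_range_2[best_j])
```

The `results` array is what a heatmap would be drawn from, for instance with matplotlib's `imshow`. A broad plateau of similar values around the best cell suggests the algo is not overly sensitive to the exact lookbacks, whereas a single isolated peak is a warning sign of overfitting.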

Held "constant" in the backtest are the following:

trades are entered when the pair's spread diverges by more than +/- 1.0 standard deviations (Z-scores),
trades are exited when the spread converges to 0.0 Z-scores.

Feel free to change these values to whatever you wish. The for loop that runs all of the backtests can also be easily modified to run over these entry and exit Z-score parameters instead.
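
As a rough illustration of the entry and exit rules above, the sketch below maps a Z-score and the current position to a target position. The function name, the +1/-1/0 encoding, and the trade direction (short the spread when the Z-score is high, long when it is low) are assumptions for illustration; the notebook's own logic may differ.

```python
def target_position(zscore, current_position, entry_z=1.0, exit_z=0.0):
    """Return +1 (long the spread), -1 (short the spread) or 0 (flat)."""
    if current_position == 0:
        if zscore > entry_z:
            return -1  # spread looks rich: short it (assumed direction)
        if zscore < -entry_z:
            return 1   # spread looks cheap: long it (assumed direction)
        return 0
    # Already in a trade: exit once the spread has converged to the exit level.
    if current_position == 1 and zscore >= -exit_z:
        return 0
    if current_position == -1 and zscore <= exit_z:
        return 0
    return current_position

# Walk a made-up Z-score path through the rule.
position = 0
path = []
for z in [0.2, 1.3, 0.6, -0.1, -1.4, -0.5, 0.1]:
    position = target_position(z, position)
    path.append(position)
print(path)
```

Because `entry_z` and `exit_z` are plain arguments, they could be swept in the same way as the lookback parameters.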

After you've run your simulations over many different pairs of stocks and found encouraging results, you can simply clone the algo I've shared in the next reply (the Q Backtester equivalent of the Zipline algo used in this research notebook), quickly change the stocks and parameters to what you've researched, and then paper trade it live or enter it in an upcoming contest.

Happy Researching!

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

15 responses

Deleted User

Jul 23, 2015

Here's the backtester algo for you to clone.

Tristan Rhodes

Jul 23, 2015

@Justin: This looks great! Thanks for putting this together. I can't wait to dig more into it...

Deleted User

Jul 24, 2015

@Tristan, great, I hope it proves useful for your research. Please feel free to leave comments about any improvements you would like to see, or ask any questions on how to modify the code.

One thing I noticed is that I left all of the ticker symbols hardcoded. They only appear in 2 places, so they're easy to swap out, but I thought I'd mention it.

1) They show up when you pull the data with get_pricing.
2) They are in the initialize() function definition, e.g.

context.y = symbols('USO')
context.x = symbols('GLD')
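
One way to avoid hunting for hardcoded strings is to define the pair once near the top of the notebook and reference those names everywhere else. A minimal sketch, assuming hypothetical names `Y_TICKER`, `X_TICKER` and `TICKERS` that are not in the original notebook; the platform-only calls are left as comments so the snippet runs anywhere:

```python
# Hypothetical names: define the pair once and reuse it in every cell.
Y_TICKER = 'USO'
X_TICKER = 'GLD'
TICKERS = [Y_TICKER, X_TICKER]

# In the data-loading cell (platform-only call, shown as a comment):
#   data = get_pricing(TICKERS, start_date='2013-01-01',
#                      end_date='2015-01-01', frequency='minute')
#
# In initialize() (platform-only call, shown as a comment):
#   context.y = symbols(Y_TICKER)
#   context.x = symbols(X_TICKER)

print("Pair under test:", TICKERS)
```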

JOHN CHAN

Jul 25, 2015

@Justin... I love the graphics in your notes. However, it seems it only works with 'USO' and 'GLD'. When you try other pairs, it won't work. Try it with other pairs and you'll see what I mean. Cheers.

Deleted User

Jul 25, 2015

@JOHN CHAN, see my response directly above your last post about how to modify the ticker symbols.

Ctrl+F on Windows, or Command+F on Mac, will work for finding text in the IPython notebook just like on a regular webpage, so you can simply search for the couple of instances of "USO" and "GLD". They are only hardcoded in the 2 places that I referenced in my previous reply above.

Let me know if this works.

JOHN CHAN

Jul 25, 2015

@Justin... I mean this one, in your notes. You just change the symbols, right? It won't work the way the USO and GLD illustration does.

"""
This cell loads in the data for our tickers used in the backtest.
Change the ticker symbols, start_date or end_date to suit your needs.
"""
# uso, gld
data = get_pricing(
    ['USO', 'GLD'],  # <--------------------------------------
    start_date='2013-01-01',
    end_date = '2015-01-01',
    frequency='minute'
)

Deleted User

Jul 25, 2015

Hi John,
Sorry, I guess I'm still not understanding your question. All you should have to do is change those ticker strings everywhere in the notebook (I just looked, and there are 3 places in total where "USO" shows up, for example).

These are the cells that contain the ticker symbol strings: In[2], In[4], In[9]

Then you just have to re-run all the cells of the notebook (by pressing Shift+Enter on each cell), or just choose "Run All.." from the "Run" dropdown menu in the upper-right of the screen.

The heatmap graphics will only update if you first run all the simulations and then execute each of the later cells that draw the graphics. "Run All" will do this automatically, but if you are running each cell individually using Shift+Enter then you will have to run all of these cells as well.

I've just tried it on my end by changing the tickers in each of those cells to something different, then I re-ran the whole notebook, and it is working for me.

Let me know if I am not understanding the issue you're seeing correctly, and I can try to help further.

Bob Edwards

Jul 28, 2015

Hi, I'm trying to run this in the research module, and I cannot seem to get it to work. I am just copying and pasting the cells. Any advice?

Thanks.

Anthony Ng

Jul 30, 2015

Hi Justin. The code is running well on my research platform. However, it is extremely slow.

When it gets to this part

Running the cell below runs a single backtest, to serve as an example before running all 25 backtests later on

# RUN this cell to run a single backtest  
algo_obj = TradingAlgorithm(initialize=initialize, handle_data=handle_data,  
                            data_frequency='minute')  
perf_manual = algo_obj.run(data.transpose(2,1,0))  
perf_returns = perf_manual.returns     # grab the daily returns from the algo backtest  
(np.cumprod(1+perf_returns)).plot()    # plots the performance of your algo

It took almost 2 hours to complete this. Is that normal? This is run on your server, right?

Steven Denison

Jul 30, 2015

Can someone explain why I would get the following error:

# RUN this cell to run a single backtest  
algo_obj = TradingAlgorithm(initialize=initialize, handle_data=handle_data,  
                            data_frequency='minute')  
perf_manual = algo_obj.run(data.transpose(2,1,0))  
perf_returns = perf_manual.returns     # grab the daily returns from the algo backtest  
(np.cumprod(1+perf_returns)).plot()    # plots the performance of your algo
---------------------------------------------------------------------------  
TypeError                                 Traceback (most recent call last)  
<ipython-input-7-2516be5a4125> in <module>()  
      2 algo_obj = TradingAlgorithm(initialize=initialize, handle_data=handle_data,  
      3                             data_frequency='minute')  
----> 4 perf_manual = algo_obj.run(data.transpose(2,1,0))  
      5 perf_returns = perf_manual.returns     # grab the daily returns from the algo backtest  
      6 (np.cumprod(1+perf_returns)).plot()    # plots the performance of your algo

TypeError: transpose() takes exactly 1 argument (4 given)

Umar Hasan

Oct 2, 2015

Hi - in "param_range_1 = map(int, np.linspace(20, 100, 5)) # hedge ratio lookback", if I want to use decimal places for instance to test over a range of 4 - 6 i.e. (np.linspace(4, 6, 5)) - what do I replace int with?

I tried = map(Decimal, (np.linspace(4, 6, 5)) but had no luck, and figured I probably need to import decimal, so then I added 'from decimal import decimal' at the top and am now getting an error message:
"InputRejected: Importing decimal from decimal raised an ImportError. No modules or attributes with a similar name were found. Our security system is concerned. If you continue to have import errors, your account will be suspended until a human can talk to you." What am I missing?

edit: ok solved by [weight for weight in np.arange(4, 8, .5)] and making a separate weight variable
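
For anyone with the same question, here is a minimal sketch of building fractional parameter ranges without `int` and without any extra import. The variable names are hypothetical. As a side note, the standard-library class is spelled `Decimal` with a capital D; whether the platform's import whitelist allows it is not something this thread establishes.

```python
import numpy as np

# Keep the values as floats instead of truncating them with int.
param_range = [float(v) for v in np.linspace(4, 6, 5)]
print(param_range)

# The np.arange form from the edit above: half steps from 4 up to, but not including, 8.
half_steps = [float(v) for v in np.arange(4, 8, 0.5)]
print(half_steps)
```

`np.linspace` includes the end point and takes the number of values, while `np.arange` excludes the end point and takes the step size, so the two calls are not interchangeable.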

Bodo Walter

Apr 14, 2016

Dear all,
there seems to be an error in the line

perf_manual = algo_obj.run(data.transpose(2,1,0))

How can this be solved?
Any help is greatly appreciated.
Regards

Deleted User

Apr 15, 2016

@Bodo,
What's the error you see? Can you paste it in?

Ryan M

Jul 27, 2016

@Justin

I updated the date range so that it does not start or end on the first, to avoid one bug:

data = get_pricing(
['USO', 'GLD'],
start_date='2013-01-02',
end_date = '2015-01-02',
frequency='minute'
)

I'm still getting an error here.

KeyError Traceback (most recent call last)
in ()
2 algo_obj = TradingAlgorithm(initialize=initialize, handle_data=handle_data,
3 data_frequency='minute')
----> 4 perf_manual = algo_obj.run(data.transpose(2,1,0))
5 perf_returns = perf_manual.returns # grab the daily returns from the algo backtest
6 (np.cumprod(1+perf_returns)).plot() # plots the performance of your algo

/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.pyc in error()
1271 "cannot use label indexing with a null key")
1272 raise KeyError("the label [%s] is not in the [%s]" %
-> 1273 (key, self.obj._get_axis_name(axis)))
1274
1275 try:

KeyError: 'the label [2013-01-02 14:31:00+00:00] is not in the [index]'

I think it's related to using minute data vs daily data.

Nathan Wolfe

Aug 2, 2016

@Ryan: Currently Zipline can't do minute backtests in Research. This should be fixed soon.
