backtest on multiple pairs trading - after cointegration analysis in R

Lionel Wüst

Hi Quantopians!

Here are the results of my pairs-trading backtest on 7 pairs chosen from the Dow Jones after a cointegration analysis in R.

What do you think about it?

Have any of you done something similar?

See you
Lionel

9 responses

Novice TAI

Hi Lionel,

Thanks for sharing, this looks interesting.
Could you share how you did the cointegration analysis in R?

Rudiger Lippert

One question I'd ask Lionel: did you perform your cointegration analysis using prices from the period of your backtest (i.e. after 2002-01-03)? If so, your algorithm has lookahead bias. For an algorithm like this to show valid backtest results, you have to either 1) perform the cointegration analysis on prices from before the backtest period, or 2) repeat the cointegration analysis periodically inside the algorithm, using only dates prior to the algorithm's current date.
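Option 2 could be sketched as a walk-forward loop. This is only a minimal illustration in base R on simulated prices: the series `p1` and `p2`, the 250-day lookback, the 20-day refit interval and the entry threshold of 2 are all assumptions made for the example, not values from Lionel's backtest. A stationarity test could be added at each refit step in the same way.

```r
# Simulated pair sharing one random-walk trend (hypothetical data)
set.seed(42)
n <- 1000
common <- 100 + cumsum(rnorm(n))
p2 <- common + rnorm(n)
p1 <- 1.5 * common + rnorm(n)

lookback <- 250     # days of history used for each fit (assumed)
refit_every <- 20   # refit interval in days (assumed)

beta <- rep(NA_real_, n)
z <- rep(NA_real_, n)

for (t in seq(lookback + 1, n)) {
  if ((t - lookback - 1) %% refit_every == 0) {
    # Use only observations strictly before day t: no lookahead
    past <- (t - lookback):(t - 1)
    b <- sum(p1[past] * p2[past]) / sum(p2[past]^2)  # no-intercept OLS slope
    spread_past <- p1[past] - b * p2[past]
    mu <- mean(spread_past)
    sdev <- sd(spread_past)
  }
  beta[t] <- b
  # z-score of today's spread against statistics known before today
  z[t] <- (p1[t] - b * p2[t] - mu) / sdev
}

# Toy signal: short the spread when it is rich, long when it is cheap
signal <- ifelse(z > 2, -1, ifelse(z < -2, 1, 0))
```

Every value of `z[t]` depends only on a hedge ratio and spread statistics estimated from days before `t`, which is the property the backtest needs.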

Deleted User

I have been playing with cointegration and VECM for a month, using the implementation at http://econ.schreiberlin.de/software/vecmclass.py. Most of the time cointegration fails out of sample, so in my experience a strategy with a Sharpe ratio of about 1 is feasible if you can put some checks around that.

Matthieu Lestel

Novice TAI,

I don't know how Lionel did his analysis but you can use the package urca.

Here is a small example:

# simulation of geometric Brownian motion  
library(sde)  
# cointegration analysis (Johansen procedure)  
library(urca)  
# make the simulation reproducible  
set.seed(1)  
# common stochastic trend (geometric Brownian motion); GBM() returns N + 1 points  
trend <- GBM(x = 100, r = 0.02, sigma = 0.2, T = 1, N = 252)  
# independent white noise for each series  
u <- rnorm(n = length(trend), mean = 0, sd = 1)  
v <- rnorm(n = length(trend), mean = 0, sd = 1)  
# two price series sharing the same trend, hence cointegrated  
p1 <- trend + u  
p2 <- trend + v  
# Johansen trace test with K = 2 lags  
coRes <- ca.jo(data.frame(p1, p2), type = "trace", K = 2)  
summary(coRes)

Lionel Wüst

Hi,

I did the cointegration analysis by fitting a linear regression model without intercept to every single pair of Dow Jones stocks (stock1_price = beta * stock2_price + error).

I then performed an Augmented Dickey-Fuller test on the spread (the error term) to test for stationarity. I used the tseries package.
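For readers who want to reproduce these two steps, here is a rough sketch on simulated data. It assumes the package meant is tseries, whose `adf.test()` function runs the Augmented Dickey-Fuller test. The series `stock1` and `stock2` are made up for the example and are not Dow Jones prices.

```r
library(tseries)  # provides adf.test(); assumed to be installed

# Simulated pair sharing one random-walk trend (hypothetical data)
set.seed(1)
n <- 500
common <- 100 + cumsum(rnorm(n))
stock2 <- common + rnorm(n)
stock1 <- 1.5 * common + rnorm(n)

# Step 1: regression without intercept, stock1 = beta * stock2 + error
fit <- lm(stock1 ~ 0 + stock2)
beta <- unname(coef(fit))

# Step 2: ADF test on the spread (the regression residuals).
# adf.test() warns when the p-value falls outside its table; silence that here.
spread <- residuals(fit)
adf <- suppressWarnings(adf.test(spread, alternative = "stationary"))

beta
adf$statistic
adf$p.value
```

One caveat: the spread here comes from an estimated hedge ratio, so the standard ADF p-values tend to be somewhat optimistic. tseries also provides `po.test()` (the Phillips-Ouliaris cointegration test), which may be worth comparing against.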

Rudiger is right, there is lookahead bias because I used all the available data on the stock prices. It was just an easy test of the Quantopian platform.

We could also assume that if the spread has been stationary for such a long time until today, it will continue to be stationary in the future.

See you
Lionel

Simon Thornington

"We could also assume that if the spread has been stationary for such a long time until today, it will continue to be stationary in the future."

Haha, would that we could! But seriously, one really must not optimize on the test data. It's tempting when you're messing around and the only goal is a pretty backtest, but eventually (hopefully) you're going to be risking real money, so it's better never to get into the habit of cutting corners.

Rudiger Lippert

A case in point is GDX (gold miners ETF) and GLD (gold ETF). At some point they were cointegrated, but eventually the relationship broke down, because it turns out that GDX also has exposure to oil prices. So, although the spread of a pair may appear to be stationary for quite some time, the relationship can still change in the future.
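To illustrate the idea (not the actual GDX/GLD history), here is a small base R sketch on simulated data. A second random-walk factor, hypothetically named `oil`, enters one leg of the pair halfway through, and a rolling Dickey-Fuller style t-statistic on the spread is used as a crude breakdown check. The window length, step size and the fixed hedge ratio of 1 are assumptions for the example.

```r
# Simulated pair: both legs follow "gold", one leg picks up "oil" later
set.seed(7)
n <- 1000
gold <- 100 + cumsum(rnorm(n))
oil <- c(rep(0, 500), cumsum(rnorm(500)))  # second factor appears at t = 501
etf_a <- gold + rnorm(n)
etf_b <- gold + oil + rnorm(n)
spread <- etf_b - etf_a                    # hedge ratio fixed at 1 for simplicity

# Dickey-Fuller style statistic without lag terms:
# t-value of the slope in diff(s) ~ lagged s
df_stat <- function(s) {
  ds <- diff(s)
  lag_s <- s[-length(s)]
  fit <- lm(ds ~ lag_s)
  summary(fit)$coefficients["lag_s", "t value"]
}

window <- 250
ends <- seq(window, n, by = 50)
stats <- sapply(ends, function(e) df_stat(spread[(e - window + 1):e]))

data.frame(window_end = ends, df_stat = round(stats, 2))
```

Strongly negative values indicate a mean-reverting spread. Values drifting toward zero in the later windows suggest the relationship is weakening, which is the kind of check that could stop trading a pair.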

Alan Wang

It is useful as a coding exercise, but never use this to trade directly. Your approach suffers from two of the most common problems in research: lookahead bias and survivorship bias. If you use more aggressive curve-fitting techniques, you can potentially get much better returns, but that means nothing for real trading.

Lionel Wüst

You're right, Alan, it was just a coding exercise.
