Statistical arbitrage

My first attempt at statistical arbitrage as explained in this paper: http://www.math.nyu.edu/faculty/avellane/AvellanedaLeeStatArb071108.pdf

I think I am having trouble fitting the OU process. Any experts here who can guide me?

9 responses

Hey Pavy,

The paper here outlines maximum likelihood estimation for an OU process and should help you with calibration. You will probably want to use scipy.optimize, or cvxopt if the problem is convex, to maximize the likelihood function.
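
As a rough illustration of the scipy.optimize route, here is a minimal sketch of OU maximum likelihood. It assumes the exact Gaussian transition density of the OU process, daily bars (dt = 1/252), and a log-parameterisation to keep kappa and sigma positive. The function names, starting values, and the synthetic path are illustrative placeholders, not taken from the paper or from the posted algorithm.

```python
import numpy as np
from scipy.optimize import minimize


def ou_neg_log_likelihood(params, x, dt):
    """Negative log-likelihood of an OU path under its exact discretisation.

    params = (log kappa, m, log sigma); the logs keep kappa and sigma positive.
    """
    log_kappa, m, log_sigma = params
    kappa, sigma = np.exp(log_kappa), np.exp(log_sigma)
    b = np.exp(-kappa * dt)
    # Conditional variance of x[n+1] given x[n]: sigma^2 (1 - e^{-2 kappa dt}) / (2 kappa)
    var = sigma**2 * (-np.expm1(-2.0 * kappa * dt)) / (2.0 * kappa)
    resid = x[1:] - (m + b * (x[:-1] - m))
    return 0.5 * resid.size * np.log(2.0 * np.pi * var) + 0.5 * np.sum(resid**2) / var


def fit_ou_mle(x, dt=1.0 / 252):
    """Fit kappa, m, sigma by numerically minimising the negative log-likelihood."""
    x = np.asarray(x, dtype=float)
    # Assumed starting point: kappa = 10, m = sample mean, sigma consistent with kappa = 10.
    start = np.array([np.log(10.0), x.mean(), np.log(x.std() * np.sqrt(20.0))])
    res = minimize(
        ou_neg_log_likelihood, start, args=(x, dt), method="Nelder-Mead",
        options={"maxiter": 5000, "xatol": 1e-6, "fatol": 1e-6},
    )
    return {"kappa": float(np.exp(res.x[0])), "m": float(res.x[1]),
            "sigma": float(np.exp(res.x[2])), "success": bool(res.success)}


# Synthetic check: simulate a path with known parameters, then try to recover them.
rng = np.random.default_rng(0)
true_kappa, true_m, true_sigma, dt = 40.0, 0.0, 0.25, 1.0 / 252
b = np.exp(-true_kappa * dt)
step_sd = true_sigma * np.sqrt((1.0 - b**2) / (2.0 * true_kappa))
path = np.empty(5000)
path[0] = true_m
for i in range(1, path.size):
    path[i] = true_m + b * (path[i - 1] - true_m) + step_sd * rng.standard_normal()

fit = fit_ou_mle(path, dt)
print(fit)
```

Nelder-Mead is used here only because it needs no gradients; for this Gaussian likelihood a closed-form or regression-based estimate is usually cheaper and should land on nearly the same values.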

Cheers,
Ryan

Following up, it looks like the paper you listed uses a linear regression approach rather than the MLE approach. The article here may give more insight into the regression approach. I see you defining a function to compute the regression in line 119 and calling it in line 108 to fit some of the OU parameters, but I don't see where you fit the speed of mean reversion, kappa, explicitly. Do you include this in the code?

Thanks for your feedback Ryan. I don't see mention of fitting the speed of mean reversion kappa anywhere? Isn't it just a calculation after fitting OU? Please let me know if I failed to understand.

Hey Pavy, you are not exactly "fitting" the OU process directly with the regression approach, but inferring the parameters of the process from the regression. I am used to kappa as a notation, but the paper you use may be different. Here, the speed of mean reversion is denoted by lambda, while on Wikipedia it is denoted by theta. I am not as familiar with OU in the context of statistical arbitrage as with interest rate models and derivatives pricing, but the mean reversion parameter tells you the speed at which a process (in the paper, the residual of a stock's returns after removing the factor exposure) will converge to its long-term mean after a deviation. The PCA approach outlined in the paper does not have an explicit temporal component, while OU does.
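
To make "inferring the parameters from the regression" concrete, here is a minimal sketch. It assumes the residual series x is regressed on its own lag, x[n+1] = a + b*x[n] + noise, and that the OU parameters follow from the exact discretisation b = exp(-kappa*dt), a = m*(1 - b), Var(noise) = sigma^2*(1 - b^2)/(2*kappa), which appears to be the mapping used in the paper's appendix. The function names and the dt = 1/252 convention are assumptions for illustration.

```python
import numpy as np


def ou_params_from_ar1(x, dt=1.0 / 252):
    """Infer OU parameters from the AR(1) regression x[n+1] = a + b*x[n] + noise."""
    x = np.asarray(x, dtype=float)
    # np.polyfit with degree 1 returns (slope, intercept).
    b, a = np.polyfit(x[:-1], x[1:], 1)
    if not 0.0 < b < 1.0:
        raise ValueError("slope b must lie in (0, 1) for a mean-reverting fit")
    resid = x[1:] - (a + b * x[:-1])
    var_resid = resid.var(ddof=2)          # two regression coefficients were estimated
    kappa = -np.log(b) / dt                # speed of mean reversion (annualised)
    m = a / (1.0 - b)                      # long-run mean
    sigma_eq = np.sqrt(var_resid / (1.0 - b * b))  # equilibrium standard deviation
    sigma = sigma_eq * np.sqrt(2.0 * kappa)        # diffusion coefficient
    return {"kappa": kappa, "m": m, "sigma": sigma, "sigma_eq": sigma_eq}


def simulate_ou(kappa, m, sigma, n, dt=1.0 / 252, seed=0):
    """Simulate an OU path with the exact discretisation (used only to check the fit)."""
    rng = np.random.default_rng(seed)
    b = np.exp(-kappa * dt)
    step_sd = sigma * np.sqrt((1.0 - b * b) / (2.0 * kappa))
    x = np.empty(n)
    x[0] = m
    for i in range(1, n):
        x[i] = m + b * (x[i - 1] - m) + step_sd * rng.standard_normal()
    return x


path = simulate_ou(kappa=50.0, m=0.1, sigma=0.3, n=20000, seed=1)
est = ou_params_from_ar1(path)
print(est)
# A z-score style signal would then be (x[-1] - est["m"]) / est["sigma_eq"].
```

So kappa is not fitted separately: it falls out of the regression slope b, which is why a slope outside (0, 1) means the window shows no mean reversion to trade.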

Pavy, this link might help you with the OU process. It includes MATLAB code that does the estimation (CointPairsTrade.m).

http://tradingwithmatlab.blogspot.com/2009/12/pairs-trading-cointegration-testing.html

Basically there are two ways to estimate an OU process: maximum likelihood and OLS. For the standard process both have closed-form solutions, as mentioned in the page linked by Ryan.

I've implemented Avellaneda's model in the past and wasn't too impressed with the results. I think full-on cointegration approaches tend to work better.

Good luck!

Alexandre, thanks for the resources. What do you mean by full-on cointegration vs. Avellaneda's model? The OU process is a representation of a cointegration process, so I'm not sure I see the distinction. Cointegration/pairs trading is a well-known signal, and I would expect the market to be relatively efficient with respect to it. I would want to consider a few extensions to the pairs trade. One would be to determine under what conditions (regimes, stock fundamentals, market distress) a cointegration relationship breaks down and prices continue to diverge. Careful, this can result in big losses! Another would be to consider cointegration between larger baskets (3+ assets) of stocks rather than pairs. I would be interested to see these topics explored further on Quantopian!

Avellaneda uses an OU process, which models a stationary process. However, this does not imply that the residuals being modeled will converge quickly or be tradable in a profitable way. By cointegration I mean either pairs of stocks or baskets of stocks. With the number of ETFs now available there are many possibilities for finding cointegrated baskets. But cointegration often breaks down over time, so it's a good idea to include a rule for cointegration stability over time.
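
One way such a stability rule could look, as a minimal sketch only: re-estimate the hedge regression and the spread's AR(1) half-life on rolling windows, and trade the pair only if every window shows fast mean reversion. This is a crude proxy, not a formal cointegration test such as Engle-Granger or Johansen, and the window length, step, and half-life threshold below are arbitrary assumptions.

```python
import numpy as np


def spread_half_life(y, x):
    """Regress y on x, fit AR(1) to the residual spread, return its half-life in bars."""
    beta, alpha = np.polyfit(x, y, 1)           # hedge ratio and intercept
    spread = y - (alpha + beta * x)
    b, _ = np.polyfit(spread[:-1], spread[1:], 1)
    if not 0.0 < b < 1.0:
        return np.inf                           # no mean reversion in this window
    return np.log(0.5) / np.log(b)


def is_stable(y, x, window=250, step=50, max_half_life=10.0):
    """True only if every rolling window has a spread half-life <= max_half_life."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    starts = range(0, len(y) - window + 1, step)
    half_lives = [spread_half_life(y[s:s + window], x[s:s + window]) for s in starts]
    # An empty list means there was not enough data for even one window.
    return bool(half_lives) and all(h <= max_half_life for h in half_lives)


# Synthetic illustration: one pair that shares a common trend, one that does not.
rng = np.random.default_rng(3)
n = 1500
x = np.cumsum(rng.standard_normal(n))           # random-walk "price"
noise = np.zeros(n)
for i in range(1, n):
    noise[i] = 0.5 * noise[i - 1] + rng.standard_normal()  # stationary spread
y_linked = 2.0 * x + noise                      # cointegrated with x by construction
y_unlinked = np.cumsum(rng.standard_normal(n))  # independent random walk

linked_ok = is_stable(y_linked, x)
unlinked_ok = is_stable(y_unlinked, x)
print(linked_ok, unlinked_ok)
```

Requiring every window to pass is deliberately strict; a softer variant could require, say, most of the recent windows, at the cost of reacting later when a relationship breaks down.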

How do I use it? I am new and have no idea how it works or how to use it. Would anyone explain? Thank you.

Hey Jim, I would suggest reading the Avellaneda paper, stepping through the code written by Pavy and Alexandre, and trying to replicate the results. For more of a background on financial econometrics, I would suggest this book.