Denoising and detoning - MLAM, Marcos Lopez de Prado

Hello everyone,

In his books, Marcos describes a technique for denoising a correlation matrix. It is based on linear algebra, more specifically on the eigenvalues and eigenvectors of the matrix.
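For readers who have not seen the technique, here is a minimal sketch of one common variant (constant residual eigenvalue): every eigenvalue below a noise cutoff is replaced by the average of those eigenvalues, and the matrix is rebuilt and rescaled to a unit diagonal. The function name `denoise_corr` and the parameter `lam_plus` are hypothetical names chosen for this illustration, and the cutoff is assumed to be supplied by the caller.

```python
import numpy as np

def denoise_corr(corr, lam_plus):
    """Sketch: flatten eigenvalues below lam_plus, then rebuild the matrix.

    corr     : symmetric correlation matrix (N x N)
    lam_plus : assumed noise cutoff, supplied by the caller
    """
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigenvalues in ascending order
    noise = eigvals < lam_plus                   # eigenvalues treated as noise
    cleaned = eigvals.copy()
    if noise.any():
        cleaned[noise] = eigvals[noise].mean()   # keeps the trace unchanged
    rebuilt = (eigvecs * cleaned) @ eigvecs.T    # V diag(cleaned) V^T
    d = np.sqrt(np.diag(rebuilt))
    return rebuilt / np.outer(d, d)              # rescale to a unit diagonal
```

Averaging rather than zeroing the noise eigenvalues keeps the rebuilt matrix positive definite as long as that average is positive.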

I have a correlation matrix of the returns of 500 stocks, and to denoise it I have to find the maximum eigenvalue attributable to noise using the Marcenko-Pastur theorem. However, apart from the largest eigenvalue (around 300), the eigenvalues computed by Python are very close to 0, and some are negative. Therefore I can't fit a Marcenko-Pastur distribution, and even if I could, it would eliminate every eigenvalue except the first.
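For reference, the Marcenko-Pastur edges for a pure-noise matrix built from T observations of N variables are sigma^2 * (1 ± sqrt(N/T))^2. The sketch below only evaluates that formula. The function name `mp_bounds` is hypothetical, and `sigma2` is assumed to be given (1.0 for a pure-noise correlation matrix); in practice the variance is often fitted to the data, which is not shown here.

```python
import numpy as np

def mp_bounds(n_obs, n_assets, sigma2=1.0):
    """Return (lambda_minus, lambda_plus) of the Marcenko-Pastur distribution.

    n_obs    : number of observations T
    n_assets : number of variables N
    sigma2   : assumed noise variance
    """
    ratio = n_assets / n_obs                      # N / T
    lam_minus = sigma2 * (1.0 - np.sqrt(ratio)) ** 2
    lam_plus = sigma2 * (1.0 + np.sqrt(ratio)) ** 2
    return lam_minus, lam_plus

# Example: 1000 observations of 250 variables
lam_minus, lam_plus = mp_bounds(1000, 250)
print(lam_minus, lam_plus)
```

When N > T the distribution also has a point mass at zero with weight 1 - T/N, so a cluster of zero eigenvalues is expected in that regime.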

Has anyone already implemented Marcos' denoising technique, and did it work? Or do you know why nearly all my eigenvalues are very close to 0 or negative?

Thank you and take care,

2 responses

Yes, it works.

There are a lot of things coming into play. How far back are you looking? How are you calculating the correlation matrix?

Having only one large eigenvalue over this time period essentially tells you that there is only one risk factor, and that risk factor is most likely the market.
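To illustrate the point on synthetic data, the sketch below simulates returns driven by a single common factor plus idiosyncratic noise and looks at the eigenvalues of the resulting correlation matrix. Everything here is an assumption made for the illustration: the sizes, the unit betas, and the Gaussian draws are not taken from the original poster's data.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_assets = 500, 50                        # assumed sizes

market = rng.standard_normal((n_obs, 1))         # one common factor
noise = rng.standard_normal((n_obs, n_assets))   # idiosyncratic part
returns = market + noise                         # every beta assumed equal to 1

corr = np.corrcoef(returns, rowvar=False)        # columns are assets
vals, vecs = np.linalg.eigh(corr)                # ascending order
eigvals = vals[::-1]                             # descending order
first_vec = vecs[:, -1]                          # eigenvector of the largest eigenvalue

print("largest eigenvalues:", eigvals[:3])
print("share of variance in the first:", eigvals[0] / n_assets)
```

In this setup the first eigenvector loads on every asset with the same sign, which is the usual signature of a market-like factor.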

Thank you for your answer Jonathan.

My database runs from 2007-01 to 2020-01 (monthly data), so the variations due to the crisis do not show up yet.
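A possible explanation for the near-zero and negative eigenvalues, offered as an assumption rather than a diagnosis: monthly data over this period gives roughly 157 observations per stock, far fewer than the 500 stocks. With T < N, the sample correlation matrix has rank at most T - 1, so the remaining eigenvalues are zero in exact arithmetic and can come out as tiny negative numbers in floating point. The sketch below shows this on synthetic random data with those assumed sizes, not on the actual database.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_assets = 157, 500                       # assumed: ~157 months, 500 stocks
returns = rng.standard_normal((n_obs, n_assets)) # synthetic stand-in data

corr = np.corrcoef(returns, rowvar=False)        # 500 x 500 sample correlation
eigvals = np.linalg.eigvalsh(corr)               # real eigenvalues, ascending

n_near_zero = int(np.sum(np.abs(eigvals) < 1e-8))
n_negative = int(np.sum(eigvals < 0))

print("eigenvalues near zero:", n_near_zero)
print("negative eigenvalues (rounding error):", n_negative)
print("maximum possible rank (T - 1):", n_obs - 1)
```

If this is the cause, options include using more observations per stock (for example a higher sampling frequency) or fewer stocks, so that T is comfortably larger than N.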

Actually, I have run my algorithm on both the correlation matrix and the variation of information matrix. These matrices are empirical; I computed them from the performance of the 500 stocks in my database. The performance of each stock is not exactly its return but a fractionally differentiated version of its price, which makes the series stationary while preserving memory.
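For context, fractional differentiation of order d applies the weights of the binomial expansion of (1 - B)^d, where B is the lag operator: w_0 = 1 and w_k = -w_{k-1} * (d - k + 1) / k. Below is a minimal fixed-width-window sketch. The function names and the `n_weights` parameter are hypothetical, and this is not necessarily how the poster computed the series.

```python
import numpy as np

def fracdiff_weights(d, n_weights):
    """First n_weights coefficients of the expansion of (1 - B)^d."""
    w = [1.0]
    for k in range(1, n_weights):
        w.append(-w[-1] * (d - k + 1) / k)
    return np.array(w)

def fracdiff(series, d, n_weights):
    """Fixed-width-window fractional difference; leading values are NaN."""
    x = np.asarray(series, dtype=float)
    w = fracdiff_weights(d, n_weights)
    out = np.full(x.shape, np.nan)
    for t in range(n_weights - 1, len(x)):
        window = x[t - n_weights + 1 : t + 1][::-1]   # x_t, x_{t-1}, ...
        out[t] = w @ window
    return out

# With d = 1 and two weights this reduces to the ordinary first difference.
print(fracdiff([1.0, 3.0, 6.0, 10.0], d=1.0, n_weights=2))
```

For small d the output stays close to the price level, so series built this way can be much more strongly correlated across stocks than ordinary returns.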