Using PCA in Quantitative Finance

Jamie Veitch (Azure Hawk)

Jan 19, 2018

Hi Maxwell -

I recently discovered that for certain types of problems, one should not use ordinary least squares, but rather some form of orthogonal regression to avoid a bias called regression dilution (attenuation). I'm wondering if when calculating beta, for example, a form of orthogonal least squares should be used to avoid the bias.

For example, perhaps the first principal component of returns (e.g. of XYZ versus SPY returns) to compute beta would be more appropriate, avoiding the regression dilution bias?

Jan 25, 2018

Seriously? You need to pick your target market better...

Blue Seahawk

Jan 25, 2018

The other target was the journal of participatory medicine. Should be going for students.

1 result (0.42 seconds)
Search Results
Reader Comments - Journal of Participatory Medicine
ojs.jopm.org/index.php/jpm/comment/view/31/0

Quick Essay Help. by quick essay help (2017-12-04). Email Reply. Of course, nobody wants to get low grades in assignments,especially when the marks of the write up will be included in the final exam marks. But, with the advancement of technology, students with the help of... Read more ...

Maxwell Margenot

Jan 26, 2018

@Grant, OLS certainly has its its limitations. I guess that an analogue in finance to the measurement errors that cause regression dilution would likely be when we get incorrect pricing or other data from data providers.

It sounds like you're talking about Principal Component Regression, which is a pretty cool technique that has other useful characteristics. I wouldn't necessarily use something like this to compute beta-to-SPY, however, as it would technically not be beta. For computing risk factor exposures in general, PCA can absolutely help us out. Creating statistical risk factors after you have accounted for known risk factors helps you to further cover your bases.

Disclaimer

Jan 26, 2018

@ Maxwell -

I’ll have to put together a notebook to illustrate. I could be off base, but the standard beta calculation is the slope of a plot of the daily returns of XYZ vs. SPY. If one assumes this as a scatter plot of two independent measures, then the first principle component represents a vector that is the direction of greatest variation. It can be used as a model to relate changes in XYZ to changes in SPY. It would seem to be a valid model beta and not suffer from the pitfall of regression dilution characteristic of OLS.

https://shankarmsy.github.io/posts/pca-vs-lr.html

Jan 31, 2018

A relevant link:

I have been trying PCA based models for over 2+ years on Quantopian. The fundamental problem is that if you fit PCA today and fit PCA tomorrow on same set of stocks, the factors vary a lot and so do the betas of PCR.

Has anyone found a solution to beta and factor stability? For now, I am running model once in X days and storing it (but that has issues with pipeline data changing often). Another alternative is functional PCA but I am not ready yet with it.

My intuitive 2 cents...

Regarding beta stability, it has been observed that the Optimize API is not very effective in controlling beta to SPY. For example, we've shown on the forum that for a trailing beta computed with SimpleBeta (which uses OLS, not PCA), and a beta constraint of +/-0.05, the resulting beta after a 2-year backtest can be considerably higher: beta ~ 0.3 in some cases. The suggestion is that forecasting beta to SPY with a simple trailing window regression is not so easy (assuming that there isn't a bug in the Quantopian implementation).

I guess it is not surprising that multidimensional regressions using PCA would show a similar problem with stability, since the underlying assumption is that there is no temporal component. The assumption is that the trailing window of data (e.g. 1-year) was statistically stable over the window, and will maintain its statistical characteristics going forward.

In the context of market efficiency and public information, it seems reasonable to think that it would be hard to find stable and profitable market attributes; any naive approach will fail, since all market participants have access to simple tools, and have already applied them to the widely available information, eliminating any advantage (if there ever was one in the first place).

The solution, perhaps, is to use dynamic models that are explicitly formulated to forecast time series that are not very statistically stable. Usually, dynamic models of complex phenomena (e.g. weather, fluid flow, etc.) are much more computationally challenging and expensive (and in science/engineering, one has the benefit of some set of equations based on the underlying physics). In other words, trying to gain an edge with widely available information will require something more than run-of-the-mill algorithms.

Luca

@Grant -

I like your 2 cents, especially this statement:

In the context of market efficiency and public information, it seems
reasonable to think that it would be hard to find stable and
profitable market attributes; any naive approach will fail, since all
market participants have access to simple tools, and have already
applied them to the widely available information, eliminating any
advantage (if there ever was one in the first place).

Also I often find myself asking the question: is there any alpha left at all in the data available? Isn't that possible that the only viable alpha is in data not publicly available (or in very expensive data)? Or, given the increase of automated trading systems, isn't that possible the alpha is becoming more and more transient so that trading strategies have to constantly increase their trading frequency to be able to seize the alpha? You believe the alpha is actually in the data and we only need to build more sophisticated and dynamic methods to be able to detect it. That's reasonable and eventually I need to look for and read some papers to find out more about this topic.

Nota bene: I don't have a clue. It would be interesting to hear from Maxwell. Since this is a thread on PCA-based tools, are there case studies where someone has made money using them? Or maybe you talked with Jonathan Larkin or someone else who has traded for a living, and could provide some insight. No doubt PCA finds applicability, but I'm guessing it's not like "PCA...kaboom...manna from heaven!"

James Villa

And here's my penny sense....

In search of finding beta stability through forecasting individual stock beta to SPY by regression, PCA or other techniques, a more efficient approach is to forecast SPY movements alone then drill down to the individual stocks. We already know what the general relationship is between the stock universe and the market, SPY, so if you can find a way to accurately forecast SPY, then controlling individual stock beta should follow. This is how we do it with our market timing models, we try to predict with some degree of accuracy what SPY will do in the near future and based on that, do portfolio construction on individual stocks since we already know the historical relationship between individual stocks and SPY.

The caveat to all this is in trying to achieve total beta neutrality, you arrive at risk free rates returns which means zero alpha. Only reason you get some alpha, in context of beta, is the wiggle room of +-0.3. Hope this helps explain the phenomena.

Maxwell Margenot

For more stable PCA, check out Robust PCA. The paper is pretty good and the method is better for handling outliers and strange perturbations.

Disclaimer

Guy Fleury

Feb 2, 2018

@Maxwell, read the mentioned paper.

I had to regrettably come to the conclusion that there was nothing there that could be of any use in trading. None of it could be applied successfully to any kind of worthwhile trading strategy.

It was interesting for image processing, but nonetheless, not of any practical use in trading. Anyone venturing on that route should be prepared to waste a lot of time which will end with absolutely nothing to show for it.

So, I wish well to anyone who dares try. They have my admiration for pursuing such an adventuress path.

@Karl here is the revised notebook. This is more accurate.

There are several prospective improvements:

Replace PCA with SparsePCA
Replace LinearRegression with other regularized methods.

Hi Pravin -

I'm an intellectually lazy technical layman (actually, I make a living at this stuff, but statistics/data science/finance are not my strong suit)--what is the executive summary of your work above, in simple terms? The paper you provided is a hefty 47 pages, and your notebook, while I'm sure technically valid, has a lot more code than explanatory prose. I'm not a Python expert and the techniques are unfamiliar--what are you showing and why would one think it is valid?

Best regards,

Grant

@ Maxwell -

I'd be interested in your take on the risk model and its relationship to dimensionality reduction. If one looks at the risk model factors, it seems that they are rather arbitrary, and may not explain the actual sources of risk. We have a long list of industry sectors, which are no doubt highly correlated, and might not explain anything. The style risk factors, I gather, are as old as the hills, and probably explain very little, as well.

I'm still getting up the learning curve on this dimensionality reduction stuff, but I'm wondering if it could be applied to the Q risk model, to understand if it makes sense, or if it is just adding noise?

@Grant Here is a summary of both papers:

First paper identifies the systematic risk factors that drive the market (PCA factors) and regresses returns of each security against these factors to find a hedge portfolio. The cumulative sum of residuals should be mean reverting (we test for them and select only those). We assume that if the factors explain the returns and if the regression is stable then the portfolio of security and factors are mean reverting.
The second paper contains a technique to improve mean reversion.

Hope that helps.

Thanks Pravin - I should probably roll up my sleeves and try to understand those papers.

James Villa

@Grant, Pravin,

The approach that Pravin uses in his example is somewhat similar to what I described in my above post. It is one form of statistical arbitrage using PCA as a dimension reduction procedure to determine x principal components that can explain what drives the market (SPY, in this case) based on some lookback period of historical returns. It is then drilled down to individual stocks by regressing against these factors and Pravin explains the rest of the procedure. Stat Arb originated from pairs trading, in this example, the pairing is between SPY and individual stocks. There are many other Stat Arb approaches from simple distance measure to more complex ones, like cointegration or copulas.

https://arxiv.org/abs/1306.6291

Feb 4, 2018

It seems that for this PCA stuff to work, the factors need to be stable. As I think we're seeing for even a simple beta-to-SPY, it is not so easy. One suggestion would be to use beta-to-SPY as a simple test case, to sort out how to forecast the beta of a stock N days forward. It would seem that if this is hard, then forecasting more subtle relationships would be even harder, but maybe I'm thinking about this incorrectly. In other words, we know a priori that SPY is a dominant factor, so why not start with it, before getting fancy?

Here's a relevant reference for low-dimension PCA (2x2 or 3x3 covariance matrix) :

A Method for Fast Diagonalization of a 2x2 or 3x3 Real Symmetric Matrix
M.J. Kronenburg
(Submitted on 26 Jun 2013 (v1), last revised 16 Feb 2015 (this version, v4))

A method is presented for fast diagonalization of a 2x2 or 3x3 real symmetric matrix, that is determination of its eigenvalues and eigenvectors. The Euler angles of the eigenvectors are computed. A small computer algebra program is used to compute some of the identities, and a C++ program for testing the formulas has been uploaded to arXiv.

One crude way, I resolve the factor instability is to fit factors and regression only once in X days. I reuse the models for the next X days.

A more rigorous approach is here: https://arxiv.org/pdf/1001.2363.pdf
Python code here: https://github.com/dfm/pcp

The L matrix are your PCs.

Guy Fleury

@Pravin, the article you relate to is a resumé of the one @Maxwell cited. It does not lead anywhere as well.

Will you be able to detect from an nxm matrix its sparse outliers? I would say: easy, even just looking at it.
However, say you have a 1,000x500 stock price variation matrix: ΔP, and you want to find its sparse outliers. You could request something like: ΔP > mean ± ½ σ, or ΔP > mean ± 2 σ, your choice. The higher you will set the threshold, the sparser the S_0 matrix will become. Note that ΔP would have 500,000 data points. Even if you put 5,000 data points above 3σ, they will still be easy to detect. But, that is not the problem. Well, not the one worth something anyway.

The question is: will it improve anything knowing that: ΔP – S_0 = L_0 ? Or, will you not be looking at about the same almost randomly generated price matrix? And going forward, all the information you have accumulated over that ΔP matrix will not help you at the right hedge of the chart where you have to make your trading decisions on those 500 stocks.

Stock prices do not have sparse low density noise that dances randomly over a stable high level backdrop signal. On the contrary, whatever signal you might have is buried in high density noise to such an extent that the noise Z_0 itself is much greater than L_0.

If the authors had shown a PCP with high density noise, what their picture would have looked like would have just been static.

It makes the application of PCP strictly inapplicable to extracting useful predictive information from moving stock prices. Other methods than that would be required.

My two cents.

@Guy thanks. It looks like what you are saying holds true. I found this paper that talks about this in financial risk domain. Yet to follow everything the paper says but could be worth the attempt: http://cdar.berkeley.edu/wp-content/uploads/2016/09/risk_seminar_slides_041216.pdf

Guy Fleury