Using PCA for Statistical Factors Regression

edited Apr 19, 2018

I read Ernest Chan's "Machine Trading", and in his chapter on Factor Analysis, he introduced the idea of using Principal Component Analysis (PCA) to get the statistical factors and then regressing them against next day's returns to get buy/sell signals.
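For readers who want the idea in concrete form, here is a minimal NumPy sketch of "PCA factors, then regress against next day's returns". It is not Chan's MATLAB code and not the algorithm attached to this post. The function name `predict_next_returns`, the `n_factors` default, and the choice to regress each stock's day t+1 return on the day t factor returns are all assumptions made for illustration.

```python
import numpy as np

def predict_next_returns(returns, n_factors=5):
    """Sketch only. `returns` is a (T, N) array of daily returns, oldest row first.

    Returns a length-N vector of predicted next-day returns.
    """
    returns = np.asarray(returns, dtype=float)
    demeaned = returns - returns.mean(axis=0)

    # PCA via SVD: rows of vt are the principal directions in stock space.
    _, _, vt = np.linalg.svd(demeaned, full_matrices=False)
    loadings = vt[:n_factors].T        # (N, k) factor loadings
    factors = demeaned @ loadings      # (T, k) statistical factor returns

    # Regress day t+1 stock returns on day t factor returns (with intercept).
    X = np.column_stack([np.ones(len(factors) - 1), factors[:-1]])
    Y = returns[1:]
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)   # (k + 1, N)

    # Apply the fitted coefficients to the most recent factor returns.
    x_last = np.concatenate([[1.0], factors[-1]])
    return x_last @ beta
```

The predicted returns would then be ranked to decide what to buy and sell. The lookback length and the number of factors are tuning choices that this sketch leaves open.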

I translated his MATLAB code into Python as best I could, but the backtesting results so far have been dismal compared to the results in his book.

Was wondering if anyone has tried something similar? Or is there something clearly wrong with my code?

Appreciate any comments or help from you guys.

Thanks!
Yi Peng

10 responses

Luca

Apr 20, 2018

I would suggest initially running your backtests with commissions and slippage set to 0, just to see whether the strategy has any alpha. With the new slippage model I have seen good algorithms (or at least they used to be good) perform poorly, so watch out for that.

Grant Kiehne

Apr 20, 2018

Here's the algo (Backtest ID: 5ad9230ba7eb4e43d5833bac) with:

    set_commission(commission.PerShare(cost=0, min_trade_cost=0))  
    set_slippage(slippage.FixedSlippage(spread=0))

Grant Kiehne

Apr 20, 2018

Would you be willing to share the new code? What did you fix?

luc prieur

Feb 13, 2019

This is my implementation of Ernest Chan's Statistical Factor loadings algo. In addition, I have added a couple of ways to trade the OLS results. They are commented out in the code. Also, I have implemented a reduction of features from 10 to 5 using Sklearn RFE. This seems to work pretty well.
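As a rough illustration of the 10-to-5 feature reduction (a sketch, not the code in the attached algo), scikit-learn's `RFE` can wrap a linear regression and drop the weakest factor one at a time. The helper name `select_factors` and its arguments are hypothetical. It assumes the factor columns are on comparable scales, since `RFE` ranks features by coefficient magnitude.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

def select_factors(factors, target, n_keep=5):
    """Sketch only. Return the column indices of the factors RFE keeps.

    factors: (T, k) matrix of factor returns
    target:  (T,) next-day return to be explained
    """
    selector = RFE(LinearRegression(), n_features_to_select=n_keep, step=1)
    selector.fit(factors, target)
    # support_ is a boolean mask over the k input columns.
    return np.flatnonzero(selector.support_)
```

The regression that produces the trading signal would then be refitted on `factors[:, kept]` only.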

Now, if someone wants to contribute, please try to fix the high turnover rate.

luc prieur

Feb 13, 2019

Here is pretty much the same as the above, but without the RFE feature selection. From the resulting OLS, we rank the stocks by OLS score and split them into top and bottom groups to create the long-short portfolio.
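The top/bottom split can be sketched as follows. This is an equal-weight illustration under my own assumptions (the function name `long_short_weights`, the `n_each` parameter, and a dollar-neutral book with gross exposure of 1), not the exact weighting used in the attached backtest.

```python
import numpy as np

def long_short_weights(scores, n_each):
    """Sketch only. Long the n_each highest scores, short the n_each lowest."""
    scores = np.asarray(scores, dtype=float)
    if 2 * n_each > len(scores):
        raise ValueError("n_each is too large for the number of stocks")

    order = np.argsort(scores)                   # ascending by score
    weights = np.zeros(len(scores))
    weights[order[-n_each:]] = 0.5 / n_each      # long leg sums to +0.5
    weights[order[:n_each]] = -0.5 / n_each      # short leg sums to -0.5
    return weights
```

With six stocks and `n_each=2`, the two best-scored names each get +0.25, the two worst each get -0.25, and the middle two stay at zero.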

Blue Seahawk

Feb 13, 2019

This is not meant to address turnover, although it might help there; I didn't check.
I'm just offering some extras and options. Occasionally weighting by score can be informative.

Guy Fleury

Feb 13, 2019

It does not matter how much alpha you get in a backtest if a trading strategy cannot at least survive its frictional costs.

The first acid test for any trading strategy is to find out whether it can survive these frictional costs (commissions, slippage, and other fees). The second test might be to see whether it breaks down going forward (giving it more time). A third test is to figure out whether it is scalable (give it more money to manage).

All 3 tests were done simultaneously in the attached algo using Luc's version (Backtest ID: 5c62b8dee310ed49b6a0c97e).

I have not read the program; however, I have no motivation to go any further.

luc prieur

Feb 13, 2019

@Guy, thank you for your contribution. Yes, it was obvious to me that an algo with a turnover of 100%+ will fail given the standard slippage of 5 bps. That is why I posted my results asking the community for ideas on how to reduce the turnover, if at all possible.
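To put rough numbers on that, here is a back-of-the-envelope sketch. It makes two assumptions: turnover is measured as the sum of absolute weight changes per rebalance (definitions vary between tools), and the 100% figure is a daily number, which is not stated above. Both function names are hypothetical.

```python
import numpy as np

def turnover(old_weights, new_weights):
    """Sketch only. Sum of absolute weight changes in one rebalance."""
    old = np.asarray(old_weights, dtype=float)
    new = np.asarray(new_weights, dtype=float)
    return float(np.abs(new - old).sum())

def annual_cost_drag(daily_turnover, cost_bps=5.0, trading_days=252):
    """Sketch only. Approximate yearly return lost to a flat per-trade cost."""
    return daily_turnover * cost_bps / 1e4 * trading_days
```

Under these assumptions, `annual_cost_drag(1.0)` is 0.126, so trading 100% of the book every day at 5 bps costs about 12.6% a year before commissions.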

Thanks @Blue for posting your code. Your "norm()" function has decreased turnover (TO) to 18%. I have yet to understand why. I will check your code carefully.

Is there any filter that could be applied to shrink the universe and, as a result, reduce the TO?

/Luc