Trading with K-Means and LASSO Regression

my first trading algo

This strategy dynamically chooses the top-performing stocks (by Sharpe ratio) from each cluster, then uses an L1 regularization term (LASSO) to penalize the portfolio weights and achieve an all-long portfolio, rebalanced quarterly. At least, that's the goal here.
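For anyone following along, here is a minimal sketch of just the "best stock per cluster by Sharpe ratio" step. It is not the code from the attached backtest: the function names, the sample tickers, and the zero risk-free rate are all assumptions made for illustration, and the cluster labels are taken as given (for example, as produced by a K-means fit).

```python
from statistics import mean, stdev


def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of a return series (risk-free rate assumed 0)."""
    sd = stdev(returns)
    if sd == 0:
        return 0.0
    return mean(returns) / sd * periods_per_year ** 0.5


def best_per_cluster(returns_by_symbol, labels):
    """Return {cluster_id: symbol} with the highest-Sharpe symbol per cluster."""
    best = {}
    for symbol, rets in returns_by_symbol.items():
        cluster = labels[symbol]
        score = sharpe(rets)
        if cluster not in best or score > best[cluster][1]:
            best[cluster] = (symbol, score)
    return {cluster: symbol for cluster, (symbol, _) in best.items()}


# Hypothetical daily returns and cluster labels, for illustration only.
returns_by_symbol = {
    "AAA": [0.01, 0.02, 0.01, 0.03],
    "BBB": [0.01, -0.02, 0.03, -0.01],
    "CCC": [0.00, 0.01, 0.00, 0.01],
    "DDD": [-0.01, -0.02, 0.01, 0.00],
}
labels = {"AAA": 0, "BBB": 0, "CCC": 1, "DDD": 1}

picks = best_per_cluster(returns_by_symbol, labels)
```

The picks would then be handed to the weighting step, which is not shown here.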

I added as many comments in the source code as possible to help elucidate, step-by-step, how the code operates. Below are some notes I took after messing with it for a bit:

  • More clusters result in lower beta (as expected)
  • Shifting the market entry point 17 days from the start of the month, to try to coincide with the release of earnings reports, resulted in -300% returns (unexpected; I'll have to test further to find the best entry point)
  • Cluster size of 30 seemed to provide the best risk/return tradeoff
  • Kept penalization parameter at 2 for all tests
  • This algo would not survive the 2008-2009 financial crisis

Feedback and criticism are heavily encouraged and appreciated!! I'd particularly like to hear how to go about reducing the risk factors associated with this strategy.

6 responses

The algorithm invests mostly in stocks whose symbols start with "A" or "B", so there may be an error in the code.

I noticed that as well. I'll go into the research environment in a bit to verify whether that's intended or not

Here's the revised source code, with backtest. The stocks definitely weren't being ordered correctly; I corrected the way the best stocks were passed through the algo, and modified the algo to liquidate all positions before re-evaluating the optimal portfolio. The leverage should be 1 all the way through, but during the trades for February 2011, May 2011, November 2012, and May 2013 (there were more, but these were the peaks), the leverage would jump to around 1.2-1.5.

I'm pretty sure selling and shorting are two separate things, but do they have the same effect on leverage? Is the liquidation causing this? Forgive me for my somewhat limited understanding of these risk metrics.
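A small arithmetic sketch may help with the leverage question. It assumes the common definition of gross leverage (sum of absolute position values divided by account equity); the position names and dollar amounts are made up, and whether this is what actually happened in the backtest is only a guess. Under that definition, selling a stock you hold lowers exposure, while a short adds to it. One possible cause of the spikes is buys for the new portfolio filling before the liquidation sells do.

```python
def gross_leverage(positions, cash):
    """Gross exposure (sum of |position value|) divided by account equity."""
    gross = sum(abs(value) for value in positions.values())
    equity = sum(positions.values()) + cash
    return gross / equity


# Hypothetical $100k account, fully invested in one holding.
fully_invested = gross_leverage({"OLD": 100_000}, cash=0)

# New buys fill before the liquidation sells: cash goes negative for a while.
mid_rebalance = gross_leverage({"OLD": 100_000, "NEW": 30_000}, cash=-30_000)

# Once the old position is actually sold, exposure drops back down.
after_sell = gross_leverage({"NEW": 30_000}, cash=70_000)

# A short position counts toward gross exposure even though it raises cash.
with_short = gross_leverage({"NEW": 100_000, "SHORTED": -50_000}, cash=50_000)

print(fully_invested, mid_rebalance, after_sell, with_short)
```

If this is the cause, waiting for the liquidation orders to fill before placing the new buys (or ordering directly to target weights instead of liquidating first) should keep the number closer to 1.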

so I changed the order logic a bit and dropped the clusters to 15, and the algo seems to perform decently from 2010 to 2017. this is my best effort to keep leverage near 1, and the algo is working mostly as intended, so I'll probably move on from this. so thankful for all the wonderful resources on this website for research

I've simplified the code and made 2 changes:
1. Exclude single stock clusters
2. Maximum allocation to single stock limited to 30%
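For readers without access to the attached code, the two changes can be sketched roughly as follows. This is an illustration under assumptions, not the attached implementation: the helper names, the sample symbols, and the pro-rata redistribution of excess weight are all hypothetical choices.

```python
from collections import Counter


def drop_single_stock_clusters(labels):
    """Keep only symbols whose cluster contains at least two stocks."""
    sizes = Counter(labels.values())
    return {sym: c for sym, c in labels.items() if sizes[c] > 1}


def cap_weights(weights, cap=0.30):
    """Cap long-only weights at `cap`, redistributing the excess pro rata."""
    if cap * len(weights) < 1.0:
        raise ValueError("cap is too small for a fully invested portfolio")
    total = sum(weights.values())
    w = {sym: value / total for sym, value in weights.items()}
    capped = set()
    while True:
        over = [s for s in w if s not in capped and w[s] > cap + 1e-12]
        if not over:
            return w
        capped.update(over)
        for s in over:
            w[s] = cap
        free = [s for s in w if s not in capped]
        free_total = sum(w[s] for s in free)
        remaining = 1.0 - cap * len(capped)
        for s in free:
            # Scale uncapped weights to absorb the excess; split evenly if all are zero.
            w[s] = remaining * w[s] / free_total if free_total > 0 else remaining / len(free)


# Hypothetical cluster labels and raw weights, for illustration only.
kept = drop_single_stock_clusters({"A": 0, "B": 0, "C": 1, "D": 2, "E": 2})
capped = cap_weights({"A": 0.6, "B": 0.2, "C": 0.1, "D": 0.1})
```

Note that a 30% cap needs at least four stocks to stay fully invested, which is why the sketch raises an error otherwise.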

hey thanks xelio! i wasn't quite sure how to add those kinds of constraints, but your source code makes things much clearer. it'll definitely come in handy for my mean-reversion algo