For contest scoring, why does volatility matter if we already have Sharpe?

I see algos with a 50% return and 20% vol rank lower than algos with 5% return and 5% vol. It appears that:

  1. Most of the paper trading scores have really low vol and really low returns (probably due to overfitting in the backtest).
  2. Algos that have acceptable levels of vol are severely penalized, even though they have high Sharpes.
5 responses

The contest metrics heavily favor extremely low volatility. Not only is volatility itself a metric, but Sharpe and Calmar effectively have volatility in the denominator, so ultra-low-vol strategies will dominate the ranks for those metrics too. Furthermore, ultra-low-vol algos may also rank highly on beta. I haven't worked out how volatility affects stability and consistency; those were introduced after I won, and I haven't been following the contest so closely since.
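To put rough numbers on the denominator effect, here is a minimal sketch. It assumes a zero risk-free rate and the textbook definitions (Sharpe as annualized return over annualized volatility, Calmar as annualized return over max drawdown); the contest's exact formulas may differ, and the function names are made up for illustration.

```python
def sharpe_ratio(annual_return, annual_vol, risk_free=0.0):
    """Textbook Sharpe ratio; risk_free=0 is an assumption, not the contest's setting."""
    return (annual_return - risk_free) / annual_vol


def calmar_ratio(annual_return, max_drawdown):
    """Textbook Calmar ratio: annualized return over max drawdown (as a positive fraction)."""
    return annual_return / max_drawdown


# The two algos from the original question.
high_vol_sharpe = sharpe_ratio(0.50, 0.20)  # 50% return, 20% vol
low_vol_sharpe = sharpe_ratio(0.05, 0.05)   # 5% return, 5% vol

# Hold the return fixed at 1% and shrink volatility:
# the ratio grows without bound as the denominator approaches zero.
shrinking_vol = [sharpe_ratio(0.01, vol) for vol in (0.05, 0.01, 0.001)]

print(high_vol_sharpe, low_vol_sharpe)
print(shrinking_vol)
```

Under these assumptions the 50%/20% algo still has the higher Sharpe, so Sharpe alone would not rank it below the 5%/5% algo; it is the separate volatility metric (and anything else that rewards low vol) that drags it down.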

Of course, I can't answer "why", but the contest is what it is.

Of course, all else being equal, the lower the vol the better.

But that's also true for height among basketball players -- and as we know height isn't everything.

Some people don't trade algos without significant drawdowns!

@ Jonathan Ng,

The contest scoring system is unbalanced, with heavy weightings on the metrics where lower is better, which I don't consider right.

https://www.quantopian.com/posts/request-real-world-strategy-scoring-metric

If your algo behaves like a CD (1% annual return, 0% StDev, 0% beta, 0% MDD), its Sharpe and Calmar ratios go to positive infinity.
So 5 of the seven scoring metrics will be winners, and no matter what its return is, it will win the contest or make the top 10 every other month.
The latest additions to the scoring metrics, the so-called consistency and stability, are negatively correlated with the Sharpe ratio and only make the situation worse.
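A toy version of the CD argument, as a sketch only. It assumes the score is an average of per-metric ranks, which may not match the real contest formula, and it covers only the six metrics that can be computed from the numbers above (stability and consistency are left out). The three algos and all names are hypothetical.

```python
import math


def safe_ratio(num, den):
    """num / den, treating a zero denominator as +inf when num > 0."""
    if den == 0:
        return math.inf if num > 0 else 0.0
    return num / den


# Hypothetical field: (annual return, annual vol, |beta|, max drawdown).
algos = {
    "cd_like": (0.01, 0.00, 0.00, 0.00),
    "steady": (0.08, 0.05, 0.10, 0.04),
    "punchy": (0.50, 0.20, 0.30, 0.15),
}

HIGHER_IS_BETTER = {"return", "sharpe", "calmar"}


def metrics(ret, vol, beta, mdd):
    return {
        "return": ret,
        "volatility": vol,
        "beta": beta,
        "max_drawdown": mdd,
        "sharpe": safe_ratio(ret, vol),
        "calmar": safe_ratio(ret, mdd),
    }


def rank_table(field):
    """Rank every algo on every metric; rank 1 is best."""
    table = {name: metrics(*vals) for name, vals in field.items()}
    ranks = {name: {} for name in table}
    for metric in next(iter(table.values())):
        ordered = sorted(
            table,
            key=lambda n: table[n][metric],
            reverse=metric in HIGHER_IS_BETTER,
        )
        for position, name in enumerate(ordered, start=1):
            ranks[name][metric] = position
    return ranks


ranks = rank_table(algos)
firsts = sum(1 for r in ranks["cd_like"].values() if r == 1)
avg_rank = {name: sum(r.values()) / len(r) for name, r in ranks.items()}

print(firsts)
print(avg_rank)
```

In this toy field the CD-like algo comes last on return but first on the other five metrics, so it takes the best average rank.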

https://www.quantopian.com/posts/how-consistent-is-consistency-factor
https://www.quantopian.com/posts/how-stable-is-stability-calculation
https://www.quantopian.com/posts/quantopian-lecture-series-spearman-rank-correlation

I support neither blue nor green belts.

@ Simon Thornington
I do not agree:
"the contest is what it is."

The contest should be what it should be.

I'm looking at the daily results and I must agree with Vladimir. The contest scoring metrics are currently far too heavily biased towards low-volatility algorithms. I understand that Q wants 'safe' algos for the hedge fund, but this seems like too much bias. To make it worse, since the score is now derived only from live trading, where volatility gets magnified because of fewer trading days, high-volatility algorithms don't stand a chance.

I also would like to express my doubts about the decision to abruptly remove the backtest scoring component completely:

  • Algorithms that are winning don't have comparable backtest performance - makes you wonder if they're just getting lucky.
  • The consistency metric is not flawless, to say the least, and there is too much faith being put into its ability to adequately encompass the algorithm's two-year performance - good or bad.
  • Perhaps the backtest component ought to be phased out in a different manner instead, maybe from 30% to 0 over 60 trading days? My point is that at least some of the score must be kept for it while there isn't enough live data. The consistency metric already discourages over-fitting quite reasonably, since the penalty is massive if your consistency score is low.
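The phase-out idea in the last bullet could look something like the sketch below. The linear schedule and the way the two scores are blended are my assumptions, not anything Q has published, and the function names are hypothetical.

```python
def backtest_weight(live_days, start_weight=0.30, phase_out_days=60):
    """Linearly decay the backtest weight from start_weight to 0 over phase_out_days."""
    if live_days <= 0:
        return start_weight
    if live_days >= phase_out_days:
        return 0.0
    return start_weight * (1 - live_days / phase_out_days)


def blended_score(backtest_score, live_score, live_days):
    """Weighted mix of backtest and live scores; the live weight is the remainder."""
    w = backtest_weight(live_days)
    return w * backtest_score + (1 - w) * live_score


# Weight given to the backtest after 0, 30 and 60 live trading days.
schedule = [backtest_weight(d) for d in (0, 30, 60)]
print(schedule)
```

With this schedule the backtest still carries half of its original weight after 30 live trading days and drops out entirely at day 60.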