Contest scoring for algo with out of sample returns <= 0

Hi,

The top of the leaderboard is now filled with overfitted algos that perform unrealistically well in sample but have zero or negative returns out of sample.

I suggest adding a new contest rule: any algo with an out-of-sample return <= 0 is not eligible to win, regardless of its ranking/score.
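
To make the idea concrete, here is a rough sketch of the filter I have in mind. The entry list and field names are hypothetical, not Quantopian's actual data model; the point is just that ineligible algos are dropped before ranking:

    # Hypothetical leaderboard entries; field names are illustrative only.
    entries = [
        {'name': 'algo_a', 'score': 98.2, 'out_of_sample_return': -0.01},
        {'name': 'algo_b', 'score': 91.5, 'out_of_sample_return': 0.03},
    ]

    def eligible(entry):
        # Proposed rule: out-of-sample return must be strictly positive.
        return entry['out_of_sample_return'] > 0

    leaderboard = sorted((e for e in entries if eligible(e)),
                         key=lambda e: e['score'], reverse=True)
    print([e['name'] for e in leaderboard])  # algo_a dropped despite its higher score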

Best,

4 responses

At the beginning of the month, the scoring can be pretty volatile. In particular, an algorithm that was submitted on May 1st only has 6 days of paper trading so far, which means its score is still 2/3 based on the backtest score. Every day, the backtest score matters less and the paper trading score matters more. I am as sure as I can be that, in time, the scoring system will do what it's supposed to and the cream will rise to the top. It just takes a few days to settle out.
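
To make the mechanics concrete, here is a minimal sketch of that kind of blending. It is illustrative only, not the actual contest formula; the ramp length is a guess that happens to be consistent with the "6 days in, 2/3 backtest" figure above:

    RAMP_DAYS = 18  # hypothetical ramp length; the real schedule may differ

    def blended_score(backtest_score, paper_score, days_live):
        # The paper trading score's weight grows linearly with each day live.
        paper_weight = min(days_live / float(RAMP_DAYS), 1.0)
        return (1 - paper_weight) * backtest_score + paper_weight * paper_score

    # Six days in, the backtest still carries two-thirds of the weight:
    print(blended_score(backtest_score=0.9, paper_score=0.1, days_live=6))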

That said, I agree that it's confusing. It's my fault, too! We had an internal debate about what to do with algorithms that haven't traded yet. I advocated for the current behavior because I was aiming for a warm onboarding experience for new algo writers. I wanted them to feel good about their new entries. So, I tailored the rules to give the benefit of the doubt to algos that haven't done anything yet.

Every decision has unintended consequences. The unintended consequence here is that we have an algo currently ranked at #2 that really shouldn't be ranked that high.

I'm working on some rule changes for the June contest, and I'll do something to make sure that the algorithms that don't trade aren't as confusingly high-ranked in the first few days of the contest.

Hi Dan,

I understand that every approach has its flaws, and every system can be gamed with enough effort. That said, I feel that a minimum filter on out-of-sample return and Sharpe should be applied to eliminate algos that don't deserve even to be considered as contenders.

Another thing you might want to consider is requiring a minimum daily position. It is only fair to compare algos if they are all active; the current scoring system (I assume unintentionally) favors inactive algos. After all, I believe the goal is to find algos that invest money in the market and make a profit, not algos that do nothing (or mostly nothing) with the money they are given.
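
Here is a sketch of what I mean by a minimum-activity requirement. The thresholds and the exposure series are hypothetical, just to illustrate the shape of the rule:

    MIN_EXPOSURE = 0.5         # hypothetical: at least half the capital deployed
    MIN_ACTIVE_FRACTION = 0.9  # hypothetical: deployed on 90% of trading days

    def is_active_enough(daily_gross_exposure):
        # daily_gross_exposure: |long| + |short| as a fraction of capital,
        # one value per out-of-sample trading day.
        active = sum(1 for x in daily_gross_exposure if x >= MIN_EXPOSURE)
        return active >= MIN_ACTIVE_FRACTION * len(daily_gross_exposure)

    print(is_active_enough([0.9, 0.8, 0.0, 0.85]))  # False: idle too often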

Best,

Anh,
I'm sitting at number 3 behind a number 2 algorithm that hasn't traded at all yet, and my first instinct when I realized that was similar to yours. However, when I thought about it a bit, I realized that my first conclusion was incorrect. Imagine an algorithm that scans thousands of parameters every minute looking for the rare trade that is positive the vast majority of the time, and when it isn't, loses only a small amount. Because that is a rare find, it only averages one trade a month, held for only a few hours. Would you rather put your money there, or with someone who puts positions on every day and has a worse Sharpe ratio, return and drawdown profile, beta... because we arbitrarily decided that having a position on was more important than results? In fact, I don't even think option 1 is especially far-fetched, as it's pretty much the definition of finding and exploiting a good arbitrage opportunity.
One area I'm still thinking about is whether it's fair to make algorithms with a substantial paper trading history compete with those that get to lean on their backtest for 50% of their score. You see the unintended consequence of this when you have an algorithm that has a month or more of paper trading under its belt but is underwater because that wasn't a great month. The most rational thing to do is pull the algorithm and resubmit it, even if you don't intend to change it at all, because 50% of your score will then come from a two-year backtest (which now covers the period you just paper traded). If you just left it running, that 50% would be composed of just your lackluster results from the past month. As a result, Quantopian is disincentivizing us from running our algorithms in paper trading through ups and downs over long periods of time, which yields the most accurate results, in favor of repeatedly resetting them. I understand the need to attract new talent, but I'm not sure that's worth incentivizing the opposite of the behavior you want. The best solution is two contests, one for new talent and one for existing algorithms, but I understand that would either double Quantopian's burn rate or halve the prize.

Kevin,

Your logic is correct. Long-term return is the ultimate goal. However, since the code cannot be reviewed to verify that it is actually finding rare opportunities rather than playing the system, we need to rely on statistical metrics. And statistics only work if you have a large enough sample. Sharpe, drawdown, stability... are not as reliable if the algo is only in the market once in a while. It is not fair to compare those metrics between active and inactive algos. That is the point I am trying to make.
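
As a rough illustration of the sample-size point, under the common i.i.d. approximation (Lo, 2002) the standard error of an estimated Sharpe ratio shrinks only with the square root of the number of return observations:

    import math

    def sharpe_standard_error(sharpe, n_obs):
        # Lo (2002) asymptotic approximation for i.i.d. returns.
        return math.sqrt((1.0 + 0.5 * sharpe ** 2) / n_obs)

    # Same measured Sharpe, very different reliability:
    print(sharpe_standard_error(1.0, 250))  # ~0.08, mostly-active algo
    print(sharpe_standard_error(1.0, 20))   # ~0.27, rarely in the market

An algo that trades once in a while gives you only a handful of meaningful observations, so a high measured Sharpe there can easily be noise.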

And I agree with you: it is all about the incentives. Right now, I feel like the chance of winning by playing the system is still significantly higher than by doing something that actually works in the long run.