It doesn't make sense to me that performance over 6 months is treated the same as performance over 1 year. This is comparing apples to oranges. Given two algos with the same performance, I would trust the one with 1 year of out-of-sample data a lot more.
Why don't we compare apples to apples, by scoring all contest algos against the same out-of-sample period, from when the contest starts to when the contest ends?
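To make the idea concrete, here is a rough sketch of what I mean. The function name `window_return`, the contest dates, and the daily returns are all made up for illustration; I don't know how the contest actually computes its scores. The point is only that every algo gets clipped to the same window before it is scored.

```python
from datetime import date


def window_return(daily_returns, start, end):
    """Compound only the daily returns that fall inside [start, end]."""
    growth = 1.0
    for day in sorted(daily_returns):
        if start <= day <= end:
            growth *= 1.0 + daily_returns[day]
    return growth - 1.0


# Hypothetical contest window (assumption, not the real contest dates).
contest_start = date(2024, 1, 2)
contest_end = date(2024, 1, 4)

# Made-up daily returns. algo_a has extra history before the contest began.
algos = {
    "algo_a": {
        date(2023, 12, 29): 0.05,  # before the window, so it is ignored
        date(2024, 1, 2): 0.01,
        date(2024, 1, 3): -0.02,
        date(2024, 1, 4): 0.03,
    },
    "algo_b": {
        date(2024, 1, 2): 0.01,
        date(2024, 1, 3): -0.02,
        date(2024, 1, 4): 0.03,
    },
}

# Score every algo over the same out-of-sample window.
scores = {
    name: window_return(returns, contest_start, contest_end)
    for name, returns in algos.items()
}

for name, score in sorted(scores.items()):
    print(f"{name}: {score:.4%}")
```

With this kind of scoring, the extra pre-contest history of `algo_a` no longer affects the comparison, so two algos with identical in-window returns get identical scores.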
Best,