Help with backtesting

Hi,
I wrote a simple model to predict whether the S&P index will go up or down (1/0). My data runs from 2008 to 2020, and I use a random forest algorithm. I ran many backtests and got the following results:
1) When I train the model on 70% of the data (70% of the observations, chosen at random) and predict the index year by year (2008, 2009, ...), the results are very good. The returns are higher than the index's return in each year.
2) When I train the model on the whole sample (I know this is wrong) and predict the index year by year (2008, 2009, ...), the results are very good. The returns are higher than the index's return in each year.
But
3) When I train the model on year(t-n)...year(t-1) and predict the index for the following year (2009, 2010, ...), the results are very bad.
The training set grows with each year in this case, so I would expect the predictions to improve, but they don't.
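
To make the difference between setups 1) and 3) concrete, here is a minimal sketch of the two splitting schemes. It is not my actual code: the toy calendar of 250 observations per year and the helper names (random_split, expanding_window_splits, seen_in_training) are made up for illustration, and it only builds index sets, it does not fit a model. It shows that a random 70% split puts a large share of every year's observations into the training set, while the year-by-year scheme never trains on the year being predicted.

```python
import random

def random_split(n_obs, train_frac=0.7, seed=0):
    """Setup 1: pick train_frac of the observations at random, ignoring time order."""
    idx = list(range(n_obs))
    random.Random(seed).shuffle(idx)
    cut = int(n_obs * train_frac)
    return sorted(idx[:cut]), sorted(idx[cut:])

def expanding_window_splits(years, first_test_year):
    """Setup 3: for each test year t, train on every observation from years before t."""
    for t in sorted(set(years)):
        if t < first_test_year:
            continue
        train = [i for i, y in enumerate(years) if y < t]
        test = [i for i, y in enumerate(years) if y == t]
        if train and test:
            yield t, train, test

def seen_in_training(train_idx, year_idx):
    """Share of one year's observations that are also in the training set."""
    train = set(train_idx)
    return sum(i in train for i in year_idx) / len(year_idx)

# Hypothetical calendar: 250 observations per year, 2008 through 2020.
years = [y for y in range(2008, 2021) for _ in range(250)]

# Setup 1: the model is scored on a year it has largely been trained on.
train_rand, test_rand = random_split(len(years))
obs_2015 = [i for i, y in enumerate(years) if y == 2015]
share_random = seen_in_training(train_rand, obs_2015)
print("random split, share of 2015 seen in training:", share_random)

# Setup 3: the training window grows, but never overlaps the test year.
splits = list(expanding_window_splits(years, first_test_year=2009))
for t, train, test in splits:
    print(t, "train size:", len(train), "test size:", len(test),
          "share seen:", seen_in_training(train, test))
```

If this sketch matches what I did, then in 1) and 2) most or all of each year's observations were already in the training set when I predicted that year, and only 3) predicts data the model has never seen.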
Thanks for any help with this question.