Zipline benchmark and algorithm returns mislabelled

Hi, a basic question here. I'm using Zipline to run algorithms with custom data.

It seems several of the columns in the results dataframe are mislabelled, namely benchmark_period_return, algorithm_period_return, and returns. I found a comment to the same effect buried in the code, in cumulative.py. Basically, 'benchmark_period_return' actually contains the benchmark's cumulative returns, NOT the period returns. Similarly, 'algorithm_period_return' contains the algorithm's cumulative returns, and NOT the period returns. Lastly, 'returns' contains the period percentage returns (unlike what the QT help on the portfolio object states: "cumulative percentage returns for the entire portfolio up to this point").
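
To make the distinction concrete, here is a minimal pandas-only sketch of the two quantities, using the ^GSPC adjusted closes from this post. The variable names are mine, not Zipline's. Period returns are the bar-to-bar percent changes; cumulative returns compound them from the start, which as far as I know is what empyrical.cum_returns computes with its default starting value.

```python
import pandas as pd

# Adjusted close values for ^GSPC, copied from this post
prices = pd.Series([1218.890015, 1204.420044, 1173.969971, 1165.23999,
                    1198.619995, 1185.900024, 1154.22998, 1162.27002])

# Period returns: percent change from one bar to the next (first bar has none)
period_returns = prices.pct_change().dropna()

# Cumulative returns: compound the period returns from the first bar onward
cumulative_returns = (1.0 + period_returns).cumprod() - 1.0

# Sanity check: the compounding telescopes to price_t / price_0 - 1
check = (prices / prices.iloc[0] - 1.0).iloc[1:]
max_gap = (cumulative_returns - check).abs().max()
print(max_gap)  # should be on the order of floating-point rounding
```

Under this reading, a column that matches cumulative_returns is a cumulative series no matter what its name suggests.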

Is this documented anywhere? It seems like a pretty important issue that needs to be addressed. In fact, is there any documentation on the variables and calculations returned in Zipline?

# These are the adjusted close values for ^GSPC 

In [122]:  
gsp_adj_close = pd.Series([1218.890015,  
1204.420044,  
1173.969971,  
1165.23999,  
1198.619995,  
1185.900024,  
1154.22998,  
1162.27002])

gsp_adj_close  
Out[123]:  
0    1218.890015  
1    1204.420044  
2    1173.969971  
3    1165.239990  
4    1198.619995  
5    1185.900024  
6    1154.229980  
7    1162.270020  
dtype: float64         

# Calculate the percent gains

In [124]:  
benchmark_pct_gains = gsp_adj_close.pct_change(1)[1:]  
In [125]:  
benchmark_pct_gains  
Out[125]:  
1   -0.011871  
2   -0.025282  
3   -0.007436  
4    0.028646  
5   -0.010612  
6   -0.026705  
7    0.006966  
dtype: float64

# Calculate the benchmark cumulative returns. They are identical to benchmark_period_return. Bad nomenclature!

In [126]:  
empyrical.cum_returns(benchmark_pct_gains) - perf.benchmark_period_return.values  
Out[126]:  
1    0  
2    0  
3    0  
4    0  
5    0  
6    0  
7    0  
dtype: float64


# Calculate the percent change in the portfolio values as returned. Compare to the 'returns' column.  
# They are identical (up to floating-point rounding), whereas, according to QT help,  
# 'returns' should contain cumulative returns, and NOT period returns

In [127]: perf.portfolio_value.values  
Out[127]:  
array([ 1000000.        ,   999727.67509375,   996169.87153552,  
         980632.62051827,   966563.23558839,   951299.16259529,  
         946291.48249317])  
In [131]: perf.portfolio_value.pct_change().values - perf.returns.values  
Out[131]:  
array([             nan,   4.18502039e-17,  -1.08420217e-17,  
        -3.29597460e-17,   4.16333634e-17,   5.20417043e-17,  
         2.68882139e-17])  

# algorithm_period_return should be period returns, but is actually cumulative returns  
In [139]: empyrical.cum_returns(perf.returns).values - perf.algorithm_period_return.values  
Out[139]: array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.])
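
If you need true per-period returns from one of the cumulative columns, the compounding can be inverted. This is only a sketch: cumulative_to_period is a hypothetical helper of mine, not part of Zipline or empyrical, and the cumulative series below is rebuilt from the portfolio values above on the assumption that the first value (1,000,000) is the starting capital.

```python
import pandas as pd

def cumulative_to_period(cumulative: pd.Series) -> pd.Series:
    """Invert compounding: turn cumulative returns into per-period returns."""
    growth = 1.0 + cumulative                # growth factor relative to the start
    period = growth / growth.shift(1) - 1.0  # ratio of consecutive growth factors
    period.iloc[0] = cumulative.iloc[0]      # first period starts from a factor of 1
    return period

# Portfolio values copied from Out[127] above
portfolio_value = pd.Series([1000000.0, 999727.67509375, 996169.87153552,
                             980632.62051827, 966563.23558839, 951299.16259529,
                             946291.48249317])

# Stand-in for a cumulative column such as algorithm_period_return
# (assumption: starting capital equals the first portfolio value)
cumulative = portfolio_value / portfolio_value.iloc[0] - 1.0

period = cumulative_to_period(cumulative)

# Compare with the direct percent change of the portfolio values
max_diff = (period.iloc[1:] - portfolio_value.pct_change().iloc[1:]).abs().max()
print(max_diff)  # should be on the order of floating-point rounding
```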