@Rene Ordosgoitia Very good question.
The first issue is that the "mean period wise return by quantiles" is the mean of the daily mean returns. It isn't simply the mean of all the returns. Why is this? Alphalens makes several assumptions about how the factor values will be used and traded, and the calculations are designed around these assumptions:
- There may be a different number of stocks in each quantile each day.
- One will trade all the stocks in each quantile and equally weight each stock. So if 20 stocks are in quantile x on day 1, each would get a weight of 1/20. If 100 stocks are in quantile x on day 2, each would get a weight of 1/100.
- The net return on a given day for a given quantile is the average return of the stocks in that quantile for that day.
- The net return over several days for a given quantile is the compounded return of the average of the daily returns.
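To see why the two averages differ, here is a tiny made-up example for a single quantile (the numbers and the `returns` frame are purely illustrative, not from any real data). Day 1 has 2 stocks and day 2 has 4, so a plain mean over all rows lets day 2 count twice as much, while the mean of daily means treats each day equally.

```python
import pandas as pd

# Hypothetical 1-day returns for one quantile:
# 2 stocks on day 1, 4 stocks on day 2.
returns = pd.DataFrame({
    'date': ['d1', 'd1', 'd2', 'd2', 'd2', 'd2'],
    '1D':   [0.04, 0.02, 0.01, 0.01, 0.01, 0.01],
})

# Pooled mean: every row counts equally, so the day with more stocks dominates.
pooled_mean = returns['1D'].mean()

# Mean of daily means: equal-weight the stocks within each day,
# then give every day the same weight.
daily_means = returns.groupby('date')['1D'].mean()
mean_of_daily_means = daily_means.mean()

print(pooled_mean, mean_of_daily_means)
```

The two numbers only agree when every day has the same number of stocks in the quantile.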
Therefore, to calculate the "mean period wise return by quantiles", one first needs to find the mean return for each day, and then find the mean of those daily means. Something like this for starters (which happens to be almost exactly how Alphalens actually does the calculations):
# First create a 'grouper' to group by date
day_grouper = ['factor_quantile', merged_data.index.get_level_values('date')]
# Now use that grouper and get the mean of each returns column
# (note the double brackets: recent pandas versions require a list to select several columns)
daily_means = merged_data.groupby(day_grouper)[['1D','2D','3D','4D','5D']].mean()
# Next, create a 'grouper' to group by our factor quantiles
quantile_grouper = [daily_means.index.get_level_values('factor_quantile')]
# Now use that grouper and get the mean of the daily means by factor quantile
quantile_means = daily_means.groupby(quantile_grouper).mean()
That will calculate the "n day total mean return" for each period. However (one more step), to make comparisons between different periods easier, Alphalens 'normalizes' all the values to the compounded daily return. In other words, it reports the daily return which, if compounded over n days, gives the n day total mean return. This makes for an apples-to-apples comparison of the different n day returns. Something like this in pseudocode (which again is similar to how Alphalens actually does the calculations):
# n is the number of days in the period; mul(10000) expresses the result in basis points
quantile_daily_returns['nD'] = quantile_total_returns['nD'].add(1).pow(1/n).sub(1).mul(10000).round(3)
That's it. Take the mean of the daily mean returns and then calculate the compounded daily rate of return.
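Here is a minimal end-to-end sketch of those steps on toy data, in case it helps to run something without the real dataset. The `toy_data` frame, the random numbers, and the `periods` mapping are all made up for illustration; the only assumption carried over from above is the shape of the data (a date/asset MultiIndex, one column per return period, and a `factor_quantile` column).

```python
import numpy as np
import pandas as pd

# Hypothetical toy data shaped like merged_data above.
dates = pd.date_range('2020-01-01', periods=3)
assets = ['A', 'B', 'C', 'D']
index = pd.MultiIndex.from_product([dates, assets], names=['date', 'asset'])
rng = np.random.default_rng(0)
toy_data = pd.DataFrame({
    '1D': rng.normal(0.001, 0.01, len(index)),
    '5D': rng.normal(0.005, 0.02, len(index)),
    'factor_quantile': [1, 1, 2, 2] * len(dates),
}, index=index)

# Column name -> number of days in that period (illustrative helper).
periods = {'1D': 1, '5D': 5}
cols = list(periods)

# Step 1: mean return of each quantile on each day.
day_grouper = ['factor_quantile', toy_data.index.get_level_values('date')]
daily_means = toy_data.groupby(day_grouper)[cols].mean()

# Step 2: mean of those daily means, per quantile.
quantile_total_returns = daily_means.groupby(level='factor_quantile').mean()

# Step 3: convert each n-day return to its compounded daily rate, in basis points.
quantile_daily_returns = pd.DataFrame({
    col: quantile_total_returns[col].add(1).pow(1 / n).sub(1).mul(10000)
    for col, n in periods.items()
})

print(quantile_daily_returns.round(3))
```

For the 1D column the conversion changes nothing except the units, and compounding the 5D daily rate over 5 days gets back the 5 day total, which is a handy sanity check.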
However (there's always something else), this won't match the results from the default create_full_tear_sheet method. Why is that? The create_full_tear_sheet method, by default, assumes a long-short portfolio. It assumes one will short the stocks in the lower quantiles and long the stocks in the upper ones. The mean returns we calculated above just took the raw returns and didn't account for long and short. The short returns would need to be inverted. I won't go into that calculation here. Rather, there is a long_short parameter which forces the create_full_tear_sheet method to assume an all-long portfolio. Setting this parameter to False will result in an all-long portfolio with returns that match the returns calculated above.
create_full_tear_sheet(merged_data, long_short=False)
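To get a feel for how much a long-short view can change the numbers, here is a rough sketch on made-up data. It uses one common technique, subtracting each day's average return across all stocks before averaging by quantile. That is an assumption chosen for illustration, not a statement of what the tear sheet does internally, and the `toy` frame and its values are hypothetical.

```python
import pandas as pd

# Hypothetical 1-day returns for four stocks on two dates.
toy = pd.DataFrame({
    'date': ['d1', 'd1', 'd1', 'd1', 'd2', 'd2', 'd2', 'd2'],
    'factor_quantile': [1, 1, 2, 2, 1, 1, 2, 2],
    '1D': [0.01, 0.03, 0.04, 0.06, -0.02, 0.00, 0.01, 0.03],
})

# All-long view: mean of the daily mean raw returns per quantile.
long_only = (toy.groupby(['factor_quantile', 'date'])['1D'].mean()
                .groupby(level='factor_quantile').mean())

# Long-short style view (assumed technique): remove each day's
# cross-sectional mean return first, then repeat the same averaging.
toy['1D_demeaned'] = toy['1D'] - toy.groupby('date')['1D'].transform('mean')
demeaned = (toy.groupby(['factor_quantile', 'date'])['1D_demeaned'].mean()
               .groupby(level='factor_quantile').mean())

print(long_only)
print(demeaned)
```

In this toy case both quantiles have positive raw returns, yet the bottom quantile turns negative once the daily average is removed, so the two views can tell quite different stories about the same factor.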
It's generally a good idea to start a factor analysis with an all-long portfolio (i.e. explicitly set 'long_short=False'). I'd recommend it. The long-short portfolio which Alphalens models is split along the mean factor values each day. It assumes one will long the upper half and short the lower half. However, this may not always make sense. Often only the highest or lowest quantiles of a factor show negative returns. It may then make sense to short only those quantiles and not arbitrarily short the entire bottom half. If long_short is set to True or left at the default, the tear sheet can hide what is actually going on.
Attached is a notebook showing the manual calculations for getting the "mean period wise return by quantiles" and how they match those returned by the tear sheet.
Good luck.