Summary:
Quantopian has created empyrical, an open-source library that calculates risk and performance metrics. empyrical will soon be used by other libraries on Quantopian, like zipline and pyfolio. Using empyrical will change the calculations performed by the Quantopian backtester.
When this change rolls out, the risk and performance metrics (like Sharpe ratio) for new backtests will not match those from previously generated backtests.
Before we update the backtester, we're requesting community members review our calculation methods in empyrical. We'd love to hear your feedback.
Detailed Description:
We've been grappling for some time with a set of problems related to how we calculate and display risk and performance metrics in our products: metrics such as Sharpe Ratio, Max Drawdown and others. As many in the community have helpfully pointed out, some of our calculation methods have been inaccurate, especially within zipline (which is used in the Quantopian backtester). Furthermore, our calculation methods were inconsistent between zipline and pyfolio. As a result, the backtester frequently displayed a different value for a metric (like Sharpe Ratio) than pyfolio did.
In order to solve these accuracy and inconsistency problems, we've created a unified library for use by zipline and pyfolio. This open-source library is empyrical. empyrical will be deployed to the Quantopian site in the coming days within pyfolio and the Quantopian backtester.
This rollout brings benefits such as consistent and more accurate calculations across Quantopian. Furthermore, because the library is open source, you (the Quantopian community) can examine the methods used for calculating the metrics.
This move to empyrical brings changes to our metrics calculations. Values calculated by the backtester before the switch to empyrical will differ from values calculated after it. Specifically, the following metrics are impacted:
- Max Drawdown
- Sharpe Ratio
- Sortino
- Downside Risk
- Information Ratio
- Alpha
There are numerous reasons the values have changed. I won't outline each of them here. Generally, our testing has shown that most impacted metrics like Sharpe Ratio will be lower with empyrical. This will not universally be true.
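To give a feel for how two reasonable implementations of the same metric can disagree, here is a minimal sketch using textbook definitions of an annualized Sharpe ratio and max drawdown. This is not empyrical's code, and the function names, the 252-day annualization factor, and the sample data are all assumptions made for illustration. The `sample` flag shows one hypothetical source of disagreement (dividing by n-1 versus n in the standard deviation); it is not a claim about what actually changed between zipline and empyrical.

```python
import math
import statistics


def annualized_sharpe(returns, risk_free=0.0, periods_per_year=252, sample=True):
    """Textbook annualized Sharpe ratio from per-period simple returns.

    Illustrative only: needs at least two returns, and the defaults
    (252 periods, zero risk-free rate) are assumptions.
    """
    excess = [r - risk_free for r in returns]
    mean = statistics.fmean(excess)
    # sample=True divides by n-1, sample=False divides by n.
    # Different libraries make different choices here.
    std = statistics.stdev(excess) if sample else statistics.pstdev(excess)
    if std == 0:
        return float("nan")
    return mean / std * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the cumulative return curve (<= 0)."""
    wealth = 1.0  # value of 1 unit invested at the start
    peak = 1.0    # highest value seen so far
    worst = 0.0   # most negative drawdown seen so far
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst


# Made-up daily returns, purely for demonstration.
daily_returns = [0.01, -0.02, 0.015, -0.005, 0.03, -0.01]

print("Sharpe (n-1):", annualized_sharpe(daily_returns, sample=True))
print("Sharpe (n):  ", annualized_sharpe(daily_returns, sample=False))
print("Max drawdown:", max_drawdown(daily_returns))
```

With the same returns, the two Sharpe values differ by a factor of sqrt((n-1)/n), which is noticeable on short backtests and negligible on long ones. Small definitional choices like this are the kind of thing worth checking when you compare numbers across tools.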
Our intention here is to communicate some key items to the community:
- Our methodology for calculating risk and performance metrics in the backtester will be changing in the near term, so previous backtester calculations will not match future calculations. You will see the changes in the coming days in both the backtester and the contest leaderboard. The timeline of the rollout will be influenced, in part, by the feedback that we get.
- The methodology for pyfolio calculations remains the same -- it will simply use empyrical.
- We're keen for the community to investigate and provide us any feedback on our calculation methodologies. The platform has benefited in the past from your insight and feedback and we hope that empyrical will be another opportunity for you to contribute.
If you have questions on the changes, especially the changes in our methodology, feel free to ask them here -- or check them out for yourself in the empyrical repo.
Happy coding,
Josh