Hey everyone. Later today, we are shipping an update that changes the behavior of pipelines containing comparisons that involve BoundColumn objects. Previously, comparing a BoundColumn to another object compared the types of the two objects and ranked them by the lexicographic order of their type names. Such comparisons typically happened when you forgot to add .latest when defining a pipeline term, and they led to unexpected results.
For instance, the following would happen (in Python 2):
EquityPricing.close > 10
>>> True
This evaluated to True because the type of EquityPricing.close (zipline.pipeline.data.dataset.BoundColumn) is lexicographically greater than the type of 10 (int).
Instead, the intention was probably to define a filter using .latest:
# This results in a pipeline Filter.
EquityPricing.close.latest > 10
If you made a comparison involving a BoundColumn and didn't combine it with other filters (using & or |), you would actually have received an error message, because pipeline would have expected a Filter and instead received a bool. However, if you did combine a BoundColumn comparison with other filters, the comparison would have effectively been ignored. For example:
my_universe = (
    QTradableStocksUS()
    & (Fundamentals.mkt_val.latest > 500e6)
    & (EquityPricing.close > 500)  # This would have evaluated to `True` and effectively been ignored.
)
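To see why a stray True disappears without a trace, here is a minimal sketch. ToyFilter is a hypothetical stand-in written for this post, not zipline's actual Filter class; it assumes a filter holds one boolean per asset and accepts a plain bool on the right-hand side of &:

```python
class ToyFilter:
    """Hypothetical stand-in for a pipeline Filter: one boolean per asset."""

    def __init__(self, mask):
        self.mask = list(mask)

    def __and__(self, other):
        # A plain bool is applied to every asset, the way a constant would be.
        if isinstance(other, bool):
            other_mask = [other] * len(self.mask)
        else:
            other_mask = other.mask
        return ToyFilter(a and b for a, b in zip(self.mask, other_mask))


tradable = ToyFilter([True, False, True, True])
large_cap = ToyFilter([True, True, False, True])

# Stands in for a BoundColumn comparison that collapsed to a single bool
# before `&` ever saw it.
accidental_constant = True

universe = tradable & large_cap & accidental_constant
without_constant = tradable & large_cap

print(universe.mask)
print(universe.mask == without_constant.mask)
```

Both universes come out identical, so nothing in the results hints that one of the three conditions never looked at any data.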
Our expectation is that whenever this comparison was made, it was just a typo where the author forgot to add .latest. As a result, such a comparison should raise an exception. Going forward, a comparison involving a BoundColumn will raise an exception that includes a message like this:
EquityPricing.close > 10
>>> "TypeError: Can't compare 'EquityPricing.close' with 'int'. (Did you mean to use '.latest'?)"
This means that algorithms that included a comparison involving a BoundColumn may have run to completion in the past, but will now raise an exception.
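For anyone curious how a class can refuse comparisons like this, below is a rough sketch of the general technique. ToyBoundColumn and its qualname attribute are hypothetical names used for illustration only; this is not the code that ships in pipeline, and it only covers the ordering operators:

```python
class ToyBoundColumn:
    """Hypothetical stand-in for a BoundColumn; not zipline's implementation."""

    def __init__(self, qualname):
        self.qualname = qualname

    def _refuse_comparison(self, other):
        raise TypeError(
            "Can't compare '{}' with '{}'. (Did you mean to use '.latest'?)".format(
                self.qualname, type(other).__name__
            )
        )

    # Route every ordering operator to the same error.
    __lt__ = __le__ = __gt__ = __ge__ = _refuse_comparison


close = ToyBoundColumn("EquityPricing.close")

try:
    close > 10
except TypeError as exc:
    message = str(exc)

print(message)
```

Because the ordering methods raise instead of returning a value, the mistake surfaces at the line where the pipeline term is defined rather than later as a confusing result.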
How did this come up?
We have been working on upgrading the Q API to Python 3 ahead of the Python 2 EOL on Jan 1, 2020. Part of that work involves building a tool to help you upgrade your code from Python 2 to Python 3. To test that tool, we’ve been applying it to public algorithms shared in the community, and one of them revealed this issue. Python 3 removed the behavior where a comparison between two otherwise incomparable objects falls back to the lexicographic order of their type names. As mentioned above, we feel that the Python 3 behavior is more correct than what we were seeing in Python 2, so we decided to update pipeline to raise the new exception.
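If you want to see the Python 3 behavior for yourself, the small sketch below uses a hypothetical PlainColumn class that defines no ordering methods at all. Under Python 3, comparing it with an int raises a TypeError instead of quietly returning a bool:

```python
class PlainColumn:
    """Hypothetical class that defines no ordering methods."""


try:
    PlainColumn() > 10
    outcome = "compared"
except TypeError:
    outcome = "TypeError"

print(outcome)
```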
Note: We will make a forum post at some point in the near future providing more details on the Python 3 work we are doing and how it might affect you.
Please let me know if you have any questions.