Calculating Log Returns With Pipeline

A common need when using the pipeline API is the ability to apply a simple mathematical function to every value produced by a Factor. This operation is so common, in fact, that the Factor base class has a large number of methods for applying common scalar transformations.

A useful application of this functionality is calculating daily log returns, which are obtained by computing log(1 + r) for each simple daily return r.
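As a quick sanity check of that definition, the following minimal sketch uses plain NumPy on made-up return values (not Quantopian data). It shows that log returns add across days, whereas simple returns compound multiplicatively.

```python
import numpy as np

# Hypothetical simple daily returns for three consecutive days (illustrative values only).
simple_returns = np.array([0.01, -0.02, 0.015])

# Log return for each day: log(1 + r). np.log1p is more accurate than
# np.log(1 + r) when r is very close to zero.
log_returns = np.log1p(simple_returns)

# Compounding simple returns multiplies the (1 + r) terms...
total_simple = np.prod(1.0 + simple_returns) - 1.0
# ...whereas log returns simply add up over the same period.
total_log = log_returns.sum()

# Both totals describe the same cumulative move.
assert np.isclose(np.log1p(total_simple), total_log)
```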

In [2]:
import numpy as np
from quantopian.research import run_pipeline
from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import Factor, Returns, CustomFactor

Simple Daily Returns

The Pipeline API comes with a builtin factor for simple close-to-close returns.

In [3]:
# window_length is 2 because daily returns are calculated by comparing two days' worth of close prices.
daily_returns = Returns(window_length=2)
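
To make the window_length=2 choice concrete, here is a rough NumPy sketch of the arithmetic behind a close-to-close return. The `closes` array and its values are hypothetical, and this only illustrates the calculation; it is not the Returns factor's actual source.

```python
import numpy as np

# Hypothetical window of close prices: 2 rows (previous day, latest day) x 3 assets.
closes = np.array([
    [100.0, 50.0, 20.0],   # previous day's close
    [101.0, 49.0, 20.0],   # most recent close
])

# Percent change from the first row of the window to the last row.
daily_returns_sketch = (closes[-1] - closes[0]) / closes[0]
```

With a longer window, the same expression would give the return over the whole window rather than a single day.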

"Manual" Log Returns

We can build a LogReturns factor relatively easily using CustomFactor...

In [4]:
class LogReturns(CustomFactor):
    inputs = [daily_returns]
    window_length = 1
    
    def compute(self, today, assets, out, returns):
        # This is equivalent to, but slightly faster than `out[:] = np.log1p(returns)`.
        # Many numpy ufuncs provide an `out` parameter for speedups when 
        # the desired behavior is to copy the result of a calculation into an already-
        # existing array.
        np.log1p(returns.ravel(), out=out)
        
manual_log_returns = LogReturns()
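
The `out` idiom used in compute can be tried outside of Pipeline. In the sketch below, the array shapes mimic what compute receives with window_length=1 (a 1 x N input window and a length-N output buffer); the values themselves are made up.

```python
import numpy as np

# With window_length=1, the input window has shape (1, n_assets).
returns_window = np.array([[0.01, -0.02, np.nan]])

# Pre-allocated output buffer of shape (n_assets,), like the `out` argument of compute.
out = np.empty(3)

# ravel() flattens (1, 3) to (3,) so the result can be written directly into `out`
# without allocating a temporary result array.
result = np.log1p(returns_window.ravel(), out=out)

# The ufunc returns the very same array object that was passed as `out`.
assert result is out
```

Note that NaN inputs simply propagate to NaN outputs, which is why assets with no return data show up as NaN in the results.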

Pipeline Log Returns

... but we don't actually need to! Factor has a log1p method that behaves identically to our custom LogReturns.

In [5]:
builtin_log_returns = daily_returns.log1p()
In [6]:
pipe = Pipeline({
    'returns': daily_returns, 
    'manual_log': manual_log_returns,
    'builtin_log': builtin_log_returns,
})
pipe.show_graph('png')
Out[6]:
[Image: pipeline dependency graph rendered by show_graph]
In [7]:
result = run_pipeline(pipe, '2014', '2014-02')
In [8]:
result.head(10)
Out[8]:
                                                 builtin_log  manual_log    returns
2014-01-02 00:00:00+00:00  Equity(2 [ARNC])         0.009452    0.009452   0.009497
                           Equity(21 [AAME])        0.052446    0.052446   0.053846
                           Equity(24 [AAPL])        0.011939    0.011939   0.012011
                           Equity(25 [ARNC_PR])          NaN         NaN        NaN
                           Equity(31 [ABAX])       -0.002994   -0.002994  -0.002990
                           Equity(39 [DDC])         0.015450    0.015450   0.015570
                           Equity(41 [ARCB])       -0.004739   -0.004739  -0.004728
                           Equity(52 [ABM])        -0.007666   -0.007666  -0.007636
                           Equity(53 [ABMD])       -0.027682   -0.027682  -0.027303
                           Equity(62 [ABT])        -0.001564   -0.001564  -0.001562
In [9]:
(result.manual_log.fillna(0.) == result.builtin_log.fillna(0.)).all()
Out[9]:
True
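
The fillna(0.) calls in the comparison are needed because NaN never compares equal to itself. One alternative, sketched here with plain NumPy arrays standing in for the two result columns (values copied from the first rows of the table above), is np.allclose with equal_nan=True. Keep in mind that np.allclose checks equality within a small tolerance rather than exact equality.

```python
import numpy as np

# Stand-ins for result.manual_log and result.builtin_log (first four rows only).
manual = np.array([0.009452, 0.052446, 0.011939, np.nan])
builtin = np.array([0.009452, 0.052446, 0.011939, np.nan])

# A direct elementwise comparison reports False wherever both values are NaN.
naive_match = (manual == builtin).all()

# equal_nan=True treats NaNs in the same position as equal.
nan_aware_match = np.allclose(manual, builtin, equal_nan=True)
```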

Other Built-in Methods

Under the hood, Factor uses the excellent numexpr library to efficiently execute scalar transformations of pipeline values (incidentally, this is also how most of the binary operators like + and - work on Factor).

As of this writing, the full list of supported numexpr scalar operations is defined in the Zipline source. Each one is exposed as a method on Factor:

In [14]:
[
    Factor.abs,
    Factor.arccos,
    Factor.arccosh,
    Factor.arcsin,
    Factor.arcsinh,
    Factor.arctan,
    Factor.arctanh,
    Factor.cos,
    Factor.cosh,
    Factor.exp,
    Factor.expm1,
    Factor.log,
    Factor.log10,
    Factor.log1p,
    Factor.sin,
    Factor.sinh,
    Factor.sqrt,
    Factor.tan,
    Factor.tanh,
]
Out[14]:
[<unbound method Factor.abs>,
 <unbound method Factor.arccos>,
 <unbound method Factor.arccosh>,
 <unbound method Factor.arcsin>,
 <unbound method Factor.arcsinh>,
 <unbound method Factor.arctan>,
 <unbound method Factor.arctanh>,
 <unbound method Factor.cos>,
 <unbound method Factor.cosh>,
 <unbound method Factor.exp>,
 <unbound method Factor.expm1>,
 <unbound method Factor.log>,
 <unbound method Factor.log10>,
 <unbound method Factor.log1p>,
 <unbound method Factor.sin>,
 <unbound method Factor.sinh>,
 <unbound method Factor.sqrt>,
 <unbound method Factor.tan>,
 <unbound method Factor.tanh>]

Most of these probably aren't useful (I'm not sure who would want to take the inverse cosine of a price series), but there are certainly applications for log*, exp*, and sqrt. We get the rest for free with some fancy metaprogramming tricks, so if you really want to, you can calculate inverse-hyperbolic-tangent-returns :).
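
For the curious, the flavor of that metaprogramming can be sketched in a few lines. This is a toy illustration only: `ToyFactor`, `_make_unary_method`, and `UNARY_FUNCS` are hypothetical names, the class wraps a NumPy array rather than building a numexpr expression, and none of it is Zipline's actual code.

```python
import numpy as np

# Names of NumPy functions to expose as methods (a subset, for brevity).
UNARY_FUNCS = ("log", "log1p", "exp", "expm1", "sqrt", "arctanh")


class ToyFactor(object):
    """Hypothetical stand-in for a Factor; it just holds an array of values."""

    def __init__(self, values):
        self.values = np.asarray(values, dtype=float)


def _make_unary_method(func_name):
    # Look up the NumPy function once, when the method is generated.
    np_func = getattr(np, func_name)

    def method(self):
        # Apply the function elementwise and wrap the result in a new ToyFactor.
        return ToyFactor(np_func(self.values))

    method.__name__ = func_name
    return method


# Attach one generated method per function name.
for _name in UNARY_FUNCS:
    setattr(ToyFactor, _name, _make_unary_method(_name))

toy_returns = ToyFactor([0.01, -0.02, 0.0])
toy_log_returns = toy_returns.log1p()
```

Generating the methods in a loop means that supporting one more function only requires adding its name to the tuple.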