The Pipeline API is a great concept, but for the moment it is quite difficult to implement algorithms that use historical fundamentals data without running into an out-of-memory or timeout error.
I already described the problem in detail in this post:
https://www.quantopian.com/posts/error-in-fundamental-data#56f5da99f44e1db34b00064e
Given that all the fundamentals in the Pipeline API are quarterly, you currently need to load about 260, 456 or even more days of data to compute trailing-twelve-month (TTM) figures, for example:
import numpy as np
from quantopian.pipeline import CustomFactor
from quantopian.pipeline.data import morningstar

quarter_lenght = 65  # approximate number of trading days in a quarter
# Row offsets of the last four quarters and of the four quarters before them.
ttm = [-1, -quarter_lenght, -2*quarter_lenght, -3*quarter_lenght]
ttm_py = [-4*quarter_lenght, -5*quarter_lenght, -6*quarter_lenght, -7*quarter_lenght]

class NetIncomeChange(CustomFactor):
    window_length = 7*quarter_lenght + 1
    inputs = [morningstar.income_statement.net_income]

    def compute(self, today, assets, out, net_income):
        net_income_ttm = np.sum(net_income[ttm], axis=0)
        net_income_ttm_py = np.sum(net_income[ttm_py], axis=0)
        out[:] = (net_income_ttm - net_income_ttm_py) / net_income_ttm_py
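To put rough numbers on the example above, here is a small sketch in plain NumPy, outside of Pipeline. The asset count and the 8-byte float size are my assumptions for illustration only; I don't know how Pipeline stores the window internally.

```python
import numpy as np

quarter_lenght = 65
window_length = 7 * quarter_lenght + 1  # 456 rows loaded per asset

# Rows that compute() actually reads (same offsets as ttm + ttm_py above).
used_rows = [-1] + [-k * quarter_lenght for k in range(1, 8)]

n_assets = 8000  # assumption: rough universe size, for illustration only
bytes_per_value = np.dtype(np.float64).itemsize  # assumption: float64 storage

full_bytes = window_length * n_assets * bytes_per_value
sparse_bytes = len(used_rows) * n_assets * bytes_per_value

print(len(used_rows), "of", window_length, "rows are read")
print("full window: %.1f MB, required rows only: %.1f MB"
      % (full_bytes / 1e6, sparse_bytes / 1e6))
```

Under these assumptions the factor reads only 8 of the 456 rows it asks for, and that is for a single input.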
My suggestion, to avoid excessive memory consumption, is to also allow specifying only the indexes of the required data points instead of the full window length. For example, something like this:
window_datapoints = [-1, -quarter_lenght, -2*quarter_lenght, -3*quarter_lenght, -4*quarter_lenght, -5*quarter_lenght, -6*quarter_lenght, -7*quarter_lenght]
would surely consume far fewer resources than the current:
window_length = 7*quarter_lenght + 1
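To show what I mean, here is a minimal sketch that emulates the idea with plain NumPy on synthetic data. Note that window_datapoints is only my proposed attribute and does not exist in Pipeline; net_income_change is a hypothetical helper, and the random array merely stands in for the net_income window.

```python
import numpy as np

quarter_lenght = 65
ttm = [-1, -quarter_lenght, -2 * quarter_lenght, -3 * quarter_lenght]
ttm_py = [-4 * quarter_lenght, -5 * quarter_lenght,
          -6 * quarter_lenght, -7 * quarter_lenght]
window_datapoints = ttm + ttm_py  # hypothetical attribute, not part of Pipeline


def net_income_change(rows_ttm, rows_ttm_py):
    """Year-over-year change in TTM net income, one value per asset."""
    net_income_ttm = np.sum(rows_ttm, axis=0)
    net_income_ttm_py = np.sum(rows_ttm_py, axis=0)
    return (net_income_ttm - net_income_ttm_py) / net_income_ttm_py


# Synthetic stand-in for the net_income window: 456 days x 5 assets.
rng = np.random.RandomState(0)
full_window = rng.uniform(1.0, 2.0, size=(7 * quarter_lenght + 1, 5))

# Current behaviour: index into the full window.
from_full = net_income_change(full_window[ttm], full_window[ttm_py])

# Proposed behaviour: only the 8 requested rows would be delivered,
# in the order given by window_datapoints.
sparse_window = full_window[window_datapoints]  # shape (8, 5)
from_sparse = net_income_change(sparse_window[:4], sparse_window[4:])

assert np.allclose(from_full, from_sparse)
print(full_window.size, "values loaded vs", sparse_window.size, "needed")
```

The result is identical in both cases, so the factor's logic would not need to change, only the way the input window is delivered to compute().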
What do you think? Is it doable?
Thanks,
Costantino