Over 50% of my Pipeline generation time is spent computing prior returns in a custom factor, using the code below. I'm populating and using my own prices DataFrame by design. Some crude timing tests show that the vast majority of the time (by roughly two orders of magnitude) goes to selecting the relevant asset prices via prices[[symbols(sid) for sid in assets]]. Can the performance of this code be improved?
def compute(self, today, assets, out, close):
    todays_price_index = prices.index.get_loc(today)
    prices2 = prices[[symbols(sid) for sid in assets]].as_matrix()
    out[:] = (prices2[todays_price_index] - prices2[(todays_price_index - self.window_length)]) / \
        prices2[(todays_price_index - self.window_length)]
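
For reference, here is one direction that might be worth timing. It is only a rough sketch under assumptions, not a tested fix for the factor above. The idea is to pull the raw NumPy array out of the DataFrame once, outside compute, and on each call look up only the integer column positions and the two rows that are actually needed, rather than building a new DataFrame for all dates. The toy prices frame, the sid-keyed columns (standing in for the symbols(sid) mapping) and the prior_returns helper are all hypothetical names made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the prices DataFrame: rows are dates and columns
# are keyed directly by sid (an assumption; the original keys by symbols(sid)).
dates = pd.date_range("2020-01-01", periods=6, freq="D")
prices = pd.DataFrame(
    np.arange(1.0, 19.0).reshape(6, 3),
    index=dates,
    columns=[101, 102, 103],
)

# Extracted once, outside compute(), so no DataFrame is rebuilt per call.
price_values = prices.values


def prior_returns(today, assets, window_length):
    """Return (P_today - P_prior) / P_prior for the requested assets."""
    row = prices.index.get_loc(today)
    # Integer column positions for the requested assets (-1 if a label is missing).
    cols = prices.columns.get_indexer(assets)
    current = price_values[row, cols]
    previous = price_values[row - window_length, cols]
    return (current - previous) / previous


result = prior_returns(dates[5], [103, 101], window_length=2)
```

Inside compute this would become out[:] = prior_returns(today, assets, self.window_length), with assets mapped to whatever labels the columns really use. Two caveats: a missing label comes back from get_indexer as -1 and would silently select the last column, and a window longer than the available history gives a negative row index that wraps around, so both cases likely need an explicit check.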