Hi, I'm experimenting with the pipeline code and writing an exponential regression factor that calculates the slope of each price series so I can rank assets by it. However, I'm finding it very difficult to access each asset's column of the close ndarray that is passed in for the calculation.
Sorry, I'm a bit new to numpy/scipy, so perhaps there is a simple way to run a linear regression over the whole array in one call, but I couldn't see how to do it. The problem is that I can't pull each asset's data out of this array using the assets list the factor is passed, because I can't find a way to look the assets up by label.
My code snippet:
def compute(self, today, assets, out, close):
    x_index = pd.Series(range(self.window_length)) + 1
    close_log_returns = np.diff(np.log(close))
    scores = []
    for asset in assets:
        # This breaks once the numbers get larger: asset is a sid like 46549, which is not in the index
        asset_returns = close_log_returns[:, asset]
        slope, intercept, r_value, p_value, std_err = stats.linregress(x_index, asset_returns)
        score = slope * np.sqrt(252) * r_value**2
        if score == score:
            scores.append([asset, score])
    out[:] = scores
I get an error like this:
IndexError: index 8388 is out of bounds for axis 1 with size 8388
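For what it's worth, the error suggests the sid is being used as a column position rather than a label. Below is a minimal sketch of a positional version, assuming the columns of close are in the same order as assets. The function name regression_scores and the sample prices are made up for illustration, and this has not been run inside pipeline. It also takes the diff along the time axis and sizes the x values to match the number of returns, which the snippet above may need as well.

```python
import numpy as np
from scipy import stats


def regression_scores(close, assets):
    """Score each column of `close` (shape: window_length x n_assets).

    `assets` holds sids (labels). Assumption: the data for assets[i]
    lives in column i of `close`.
    """
    # Log returns along the time axis (axis=0), not across assets.
    log_returns = np.diff(np.log(close), axis=0)
    # One x value per return row, i.e. window_length - 1 of them.
    x_index = np.arange(1, log_returns.shape[0] + 1)

    out = np.full(len(assets), np.nan)
    for i, sid in enumerate(assets):
        # Index by position i, never by the sid itself.
        slope, intercept, r_value, p_value, std_err = stats.linregress(
            x_index, log_returns[:, i]
        )
        out[i] = slope * np.sqrt(252) * r_value ** 2
    return out


# Made-up prices for three hypothetical sids; two are far larger than the column count.
rng = np.random.default_rng(0)
close = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=(20, 3)), axis=0))
assets = np.array([24, 8388, 46549])
scores = regression_scores(close, assets)
print(dict(zip(assets.tolist(), scores)))
```

Inside compute, the same loop would presumably write out[i] = score directly instead of building a list of [asset, score] pairs, assuming out has one slot per entry in assets.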
Sorry, it's late in the evening/morning here and any help would be appreciated!
Thanks
Michael