Calculation of correlation in pipeline

Hello,
What would be a good way to calculate the correlation between two quantities in the Quantopian pipeline?

I tried:

class PriceFactCorr(CustomFactor):
    inputs = [alphaone.article_sentiment, USEquityPricing.open]
    window_length = 100

    def compute(self, today, assets, out, sentiment, open_price):
        out[:] = np.corrcoef(sentiment, open_price)[0, 1]

but it gives all NaNs:

Equity(693 [AZO]) NaN
Equity(1374 [CDE]) NaN
Equity(1582 [CL]) NaN
......

Has anyone found a good solution?

Thanks

Francesco

4 responses

Hey Francesco,

Wondering what the [0,1] is for?

I use the following code and it always works well:

Correlation = np.corrcoef(Factor_1, Factor_2)  

Does your code create a pipeline class that is supposed to calculate the correlation between sentiment 100 days ago and open price 99 days ago? If so, I think (and I welcome comments if I am incorrect) the code might work if you do this instead:
out[:] = np.corrcoef(sentiment[0], open_price[1])

If you are trying to get the correlation between sentiment yesterday and open_price today, I think you would want to decrease the window length. Or maybe better to try this:

out[:] = np.corrcoef(sentiment[-2], open_price[-1])  

I didn't test it, but something to try.

Thanks for the useful suggestions, Frank, I will try those!
Regarding your suggestion

out[:] = np.corrcoef(sentiment[-2], ...

I do not agree. I believe the correlation should be calculated over a time series rather than between two single points, since it is cov(x, y) / (sigma_x * sigma_y) and therefore needs several time points:
https://en.m.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient

The [0, 1] then refers to the off-diagonal element of the correlation matrix returned by numpy, i.e. corr(X, Y); [0, 0] would have been corr(X, X).
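
To make the point about the matrix layout concrete, here is a small standalone NumPy sketch. The two series are made-up toy numbers, not Quantopian data, and the variable names are only for illustration. It shows that np.corrcoef on two 1-D series returns a 2x2 matrix whose [0, 1] entry matches cov(x, y) / (sigma_x * sigma_y).

```python
import numpy as np

# Two toy time series (made-up numbers, for illustration only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 6.0])

# For two 1-D inputs, np.corrcoef returns the 2x2 matrix
# [[corr(x, x), corr(x, y)],
#  [corr(y, x), corr(y, y)]]
corr_matrix = np.corrcoef(x, y)
r_xy = corr_matrix[0, 1]

# The same number from the definition cov(x, y) / (sigma_x * sigma_y),
# using population covariance and population standard deviations.
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
r_manual = cov_xy / (x.std() * y.std())

print(corr_matrix.shape, r_xy, r_manual)
```

Note that this only holds for 1-D inputs. If 2-D arrays are passed, np.corrcoef treats every row as a separate variable, so [0, 1] is no longer the correlation between the two quantities.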

Francesco,

My mistake. I forgot that Factor_1 and Factor_2 in my basic example are lists of values of those factors. I wonder if something like this would do the trick:

np.corrcoef(sentiment[0:99], open_price[1:100])
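
One possible reason for the NaNs, offered as a sketch rather than a confirmed diagnosis: inside compute the inputs are 2-D arrays with one row per day and one column per asset, so np.corrcoef treats each day as a variable instead of each asset, and any NaN in the sentiment data propagates into the result. The helper below, columnwise_corr, is a hypothetical name (not part of Quantopian or NumPy); it computes one Pearson correlation per asset column and skips days where either value is NaN. The toy arrays stand in for the sentiment and open_price windows.

```python
import numpy as np

def columnwise_corr(a, b):
    """Pearson correlation of each column of `a` with the same column of `b`.

    Rows are days, columns are assets. Days where either value is NaN are
    ignored for that asset. Hypothetical helper, for illustration only.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    valid = ~(np.isnan(a) | np.isnan(b))      # usable (day, asset) pairs
    n = valid.sum(axis=0)                     # usable days per asset
    with np.errstate(invalid="ignore", divide="ignore"):
        mean_a = np.where(valid, a, 0.0).sum(axis=0) / n
        mean_b = np.where(valid, b, 0.0).sum(axis=0) / n
        da = np.where(valid, a - mean_a, 0.0)  # deviations, 0 where unusable
        db = np.where(valid, b - mean_b, 0.0)
        cov = (da * db).sum(axis=0)
        denom = np.sqrt((da ** 2).sum(axis=0) * (db ** 2).sum(axis=0))
        return cov / denom                    # NaN if an asset has no variance

# Toy stand-ins for the (window_length, n_assets) arrays passed to compute.
t = np.arange(10, dtype=float)
sentiment = np.column_stack([t, t ** 2, np.sin(t)])
sentiment[3, 2] = np.nan                      # a missing sentiment value
open_price = np.column_stack([2 * t + 1, -(t ** 2), np.cos(t)])

same_day = columnwise_corr(sentiment, open_price)
# Sentiment on day d against the open on day d + 1 (the one-day lag above).
lagged = columnwise_corr(sentiment[:-1], open_price[1:])
print(same_day, lagged)
```

Inside the factor, the last line of compute would then be something like out[:] = columnwise_corr(sentiment[:-1], open_price[1:]). For a window of 100 days, the slices [:-1] and [1:] select the same rows as [0:99] and [1:100].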

Could you share your notebook?