NaN Correlation Coefficient

When I run this code, np.corrcoef returns NaN. Does anyone know why? I don't have this issue with other securities.

import numpy as np
import matplotlib.pyplot as plt

# Pull the pricing data for our ETFs

start = '2010-01-01'
end = '2016-01-01'
XLY = get_pricing('XLY', fields='price', start_date=start, end_date=end) #consumer discretionary
XLF = get_pricing('XLF', fields='price', start_date=start, end_date=end) #financials

plt.scatter(XLY,XLF)
plt.xlabel('XLY')
plt.ylabel('XLF')
plt.title('ETF prices from ' + start + ' to ' + end)
print "Correlation coefficients"
print "XLY and XLF: ", np.corrcoef(XLY,XLF)

1 response

I think it's because these ETFs have NaN price values at certain points (not sure why...). When I use np.nan_to_num() on both data sets, I get a correlation of 0.95. But is np.nan_to_num() giving me an artificially high correlation?
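
One way to check is to compare the zero-filled result against a correlation computed only on dates where both prices are present. The sketch below is a rough illustration, not the actual XLY/XLF data: get_pricing only exists on the platform, so the two series here are synthetic stand-ins, and the names a, b and b_missing are made up for the example. np.nan_to_num() replaces each NaN with 0.0, so every missing price becomes a zero-price outlier, which distorts the estimate in a direction that depends on the data. Dropping the incomplete rows avoids that.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the two price series (assumption: NOT real XLY/XLF data).
rng = np.random.RandomState(0)
dates = pd.date_range('2010-01-01', periods=500, freq='B')
common = np.cumsum(rng.normal(0, 1, 500))  # shared random-walk component
a = pd.Series(50 + common + rng.normal(0, 0.5, 500), index=dates)
b = pd.Series(20 + 0.4 * common + rng.normal(0, 0.5, 500), index=dates)

# Knock out a few prices to mimic the missing values in the real data.
b_missing = b.copy()
b_missing.iloc[[10, 11, 200]] = np.nan

# 1. A single NaN in either input propagates through np.corrcoef.
raw = np.corrcoef(a.values, b_missing.values)[0, 1]

# 2. np.nan_to_num turns each NaN into 0.0, i.e. a fake price of zero.
zero_filled = np.corrcoef(np.nan_to_num(a.values),
                          np.nan_to_num(b_missing.values))[0, 1]

# 3. Keep only the dates where both series have a price.
both = pd.concat([a, b_missing], axis=1, keys=['a', 'b']).dropna()
dropped = np.corrcoef(both['a'].values, both['b'].values)[0, 1]

# 4. pandas' Series.corr excludes NaN pairs on its own.
pandas_corr = a.corr(b_missing)

print("missing values in b: %d" % b_missing.isnull().sum())
print("raw:         %s" % raw)
print("zero-filled: %s" % zero_filled)
print("dropped:     %s" % dropped)
print("Series.corr: %s" % pandas_corr)
```

If the zero-filled and dropped numbers differ noticeably on the real series, the 0.95 figure is being affected by the filled-in zeros. Counting the NaNs first (for example with XLF.isnull().sum()) also shows how many dates are involved.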