Hi,
I am trying to run a linear regression on two datasets using code from the lecture series, but I cannot get it to run. Instead, I get the following error: "TypeError: unhashable type: 'slice'".
This is the code I'm running:
from statsmodels import regression
import statsmodels.api as sm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
market = pd.read_csv('c:/users/myname/documents/python/benchmark.csv', index_col='quote_date', parse_dates=True, encoding="ISO-8859-1")
data = pd.read_csv('c:/users/myname/documents/python/stock.csv', index_col='quote_date', parse_dates=True, encoding="ISO-8859-1")
close_pd = data['close']
close_pm = market['close']
close_pd = close_pd.sort_index(ascending=True)
close_pm = close_pm.sort_index(ascending=True)
return_data = close_pd.pct_change()[1:]
return_market = close_pm.pct_change()[1:]
start = '2006-01-10'  # arbitrary date
end = '2017-01-27'
return_data = return_data.loc[start:end]
return_market = return_market.loc[start:end]
# So far so good; the following regression is what gives me problems:
def linreg(X,Y):
    X = sm.add_constant(X)
    model = regression.linear_model.OLS(Y, X).fit()
    a = model.params[0]
    b = model.params[1]
    X = X[:, 1]
    X2 = np.linspace(X.min(), X.max(), 100)
    Y_hat = X2 * b + a
    plt.scatter(X, Y, alpha=0.3) # Plot the raw data
    plt.plot(X2, Y_hat, 'r', alpha=0.9); # Add the regression line, colored in red
    plt.xlabel('X Value')
    plt.ylabel('Y Value')
    return model.summary()
linreg(return_data,return_market)
And this is the output I get:
Traceback (most recent call last):
  File "<ipython-input-50-529b2b154f45>", line 24, in <module>
    linreg(return_data,return_market)
  File "<ipython-input-50-529b2b154f45>", line 7, in linreg
    X = X[:, 1]
  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\frame.py", line 1997, in __getitem__
    return self._getitem_column(key)
  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2004, in _getitem_column
    return self._get_item_cache(key)
  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\generic.py", line 1348, in _get_item_cache
    res = cache.get(item)
TypeError: unhashable type: 'slice'
I think this might be related to the data type (pandas.core.series.Series), but I cannot figure out what to do. I tried converting the data to an array, but that only seemed to cause more problems and dropped the date index.
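In case it helps to narrow this down, here is a minimal sketch with made-up data (no CSV files) of what I suspect is going on. It assumes, as the traceback through pandas' frame.py suggests, that X is a pandas DataFrame by the time X[:, 1] runs, so NumPy-style indexing is not valid on it. The names demo_returns and X_demo are hypothetical, and the constant column is built by hand here as a stand-in for sm.add_constant rather than calling statsmodels.

```python
import numpy as np
import pandas as pd

# Made-up daily returns with a date index (stand-in for return_data)
dates = pd.date_range('2006-01-10', periods=5, freq='D')
demo_returns = pd.Series([0.01, -0.02, 0.015, 0.0, 0.005], index=dates, name='close')

# Hand-built stand-in (an assumption) for the DataFrame that X seems to be:
# a constant column followed by the original data
X_demo = pd.DataFrame({'const': 1.0, 'close': demo_returns})

# NumPy-style indexing fails on a DataFrame
# (the exact exception type may differ between pandas versions)
try:
    X_demo[:, 1]
    numpy_style_ok = True
except Exception:
    numpy_style_ok = False

# Positional selection with .iloc returns a Series and keeps the date index
second_col = X_demo.iloc[:, 1]

# Converting to a NumPy array allows [:, 1], but the date index is lost
second_col_array = X_demo.to_numpy()[:, 1]
```

If this is the right diagnosis, something like X.iloc[:, 1] inside linreg might be enough, but I have not confirmed that against the real data.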
Any help is very much appreciated!
Thanks in advance,
- Sondre