Hi,
I am trying to group stocks based on their exposures to common risk factors. I first identify market-wide risk factors, then estimate each stock's betas to those factors, and finally cluster stocks with similar betas using the KMeans algorithm. My code is below:
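import statsmodels.api as smapi
from sklearn import preprocessing
from sklearn.cluster import KMeans

def get_cluster(stocks, returns, riskfactors):
    # Regress every stock's returns (T x N NumPy array) on the factors (T x K);
    # transpose the coefficients to N x (K + 1) and drop the intercept column.
    betas = smapi.OLS(returns, smapi.add_constant(riskfactors)).fit().params.T[:, 1:]
    # Standardize each factor's betas to zero mean and unit variance.
    betas = preprocessing.scale(betas)
    # Assign each stock to one of 10 clusters based on its standardized betas.
    labels = KMeans(n_clusters=10).fit_predict(betas)
    # Group the stock names by cluster label.
    clusters = {}
    for stock, label in zip(stocks, labels):
        clusters.setdefault(label, []).append(stock)
    return clusters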
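To make the shapes in the regression step explicit, here is a minimal NumPy-only sketch of the beta estimation and scaling on synthetic data. It assumes returns is a T x N array and riskfactors is a T x K array; the helper names estimate_betas and standardize, the dimensions, and the random data are hypothetical and only for illustration. It swaps the statsmodels call for np.linalg.lstsq, which should give the same least-squares coefficients.

```python
import numpy as np

def estimate_betas(returns, riskfactors):
    """Hypothetical helper: OLS betas of each stock on the factors, intercept dropped."""
    returns = np.asarray(returns, dtype=float)          # shape (T, N)
    riskfactors = np.asarray(riskfactors, dtype=float)  # shape (T, K)
    # Prepend a constant column so the intercept is coefficient 0.
    X = np.column_stack([np.ones(len(riskfactors)), riskfactors])
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)  # shape (K + 1, N)
    return coef[1:].T                                   # shape (N, K)

def standardize(betas):
    """Zero mean, unit (population) variance per factor column."""
    return (betas - betas.mean(axis=0)) / betas.std(axis=0)

# Synthetic data (assumption): 500 periods, 30 stocks, 3 factors.
rng = np.random.default_rng(0)
T, N, K = 500, 30, 3
factors = rng.normal(size=(T, K))
true_betas = rng.normal(size=(N, K))
returns = factors @ true_betas.T + 0.01 * rng.normal(size=(T, N))

betas = estimate_betas(returns, factors)
scaled = standardize(betas)
print(betas.shape, scaled.shape)
```

Each row of scaled is one stock's standardized exposure vector, which is the input that KMeans sees.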
Could someone with clustering experience comment on whether this approach makes sense?
Best regards,
Pravin