This algo should, in theory, find stocks that tend to fluctuate together. It is based on an example from the sklearn website.
My program sometimes seems to hang for no apparent reason. Perhaps someone can help?
With a couple of stocks it seems to work fine:
2011-06-01 handle_data:45 INFO Start date: 2011-06-01 00:00:00+00:00
2012-05-30 handle_data:61 INFO Finished recording data : 251 days
2012-05-30 handle_data:65 INFO Have 7 complete histories
2012-05-30 handle_data:81 INFO 3 groups found:
2012-05-30 PRINT Cluster 1: 4707, 5061, 20486, 3149
2012-05-30 PRINT Cluster 2: 24
2012-05-30 PRINT Cluster 3: 18522, 5885
One problem is that I can't look up the names of SIDs. Also, being limited to 10 SIDs in total means that more general analysis can't be done.
Interesting all the same :)
Perhaps someone could check it with a bunch of unrelated stocks and a couple known to co-fluctuate?
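For anyone who wants to try that kind of check offline first, here is a rough sketch using made-up data. It is not the sklearn clustering from the algo: as a simpler stand-in, it groups series whose pairwise correlation of daily returns is above a threshold. The names `pearson`, `group_by_correlation` and `make_test_returns`, the 0.5 threshold, and the synthetic returns are all assumptions for illustration only.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def group_by_correlation(returns, threshold=0.5):
    """Group series linked by a pairwise correlation above `threshold`.

    Stand-in for a real clustering step: two series end up in the same
    group if a chain of above-threshold correlations connects them.
    """
    names = list(returns)
    parent = {name: name for name in names}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(returns[a], returns[b]) > threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    # Largest groups first.
    return sorted(groups.values(), key=len, reverse=True)


def make_test_returns(n_days=251, n_unrelated=5, seed=0):
    """Synthetic daily returns: independent series plus one co-moving pair."""
    rng = random.Random(seed)
    returns = {
        "unrelated_%d" % i: [rng.gauss(0, 0.01) for _ in range(n_days)]
        for i in range(n_unrelated)
    }
    base = [rng.gauss(0, 0.01) for _ in range(n_days)]
    returns["pair_a"] = base
    # pair_b follows pair_a with a little independent noise added.
    returns["pair_b"] = [r + rng.gauss(0, 0.002) for r in base]
    return returns


for i, group in enumerate(group_by_correlation(make_test_returns()), start=1):
    print("Cluster %d: %s" % (i, ", ".join(group)))
```

If the grouping is doing its job, `pair_a` and `pair_b` should land in one cluster and each unrelated series should be on its own. The same synthetic returns could then be fed to the real clustering step to see whether it agrees.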