So I may have overrated my own Python/Quantopian skills, but I really wanna try this one out.
I am trying to write code that calculates the correlations among a set of securities. After each calculation I want to pair up the securities that have the highest correlation, then calculate the spread of each pair, and lastly calculate the standard deviation of that spread to find the entry points for going long and short.
Same principles as Delaney Granizo-Mackenzie uses in this post, https://www.quantopian.com/posts/how-to-build-a-pairs-trading-strategy-on-quantopian, although I'd like to start from a set of securities and then let the algorithm pick out the best pairs for me each day, week or month.
What I have now:
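import pandas as pd
import numpy as np

def initialize(context):
    context.assets = [sid(23112), sid(8347), sid(6653), sid(21839), sid(24), sid(5061)]
    # Run the rebalance every day, one hour after the market opens.
    schedule_function(my_rebalance, date_rules.every_day(), time_rules.market_open(hours=1))

def my_rebalance(context, data):
    assets = context.assets
    # 60 days of daily prices, one column per security.
    price_history = data.history(assets, "price", 60, "1d")
    corr_matrix = price_history.corr()
    # Zero out the diagonal (each security's correlation with itself).
    corr_matrix[corr_matrix > 0.9999] = 0
    # Highest remaining correlation for each security.
    max_matrix = corr_matrix.max()
    log.info(max_matrix)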
The variable "corr_matrix" holds the pairwise correlations between all the securities, and I then use .max() to find the highest correlation for each one.
log.info gives me the following output:
Equity(23112 [CVX]) 0.949948
Equity(8347 [XOM]) 0.819015
Equity(6653 [T]) 0.842697
Equity(21839 [VZ]) 0.949948
Equity(24 [AAPL]) 0.819015
Equity(5061 [MSFT]) 0.848387
dtype: float64
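The output above shows only the highest correlation value for each security, not which security it belongs to. A minimal sketch of one way to get the partner as well, assuming plain pandas outside Quantopian: the synthetic prices and the tickers "A", "B", "C" are placeholders standing in for data.history(...), and masking the diagonal with NaN is a swapped-in alternative to the 0.9999 threshold.

```python
import numpy as np
import pandas as pd

# Synthetic prices stand in for data.history(...); tickers are placeholders.
rng = np.random.default_rng(0)
common = rng.normal(size=60).cumsum()  # shared random walk driving A and B
prices = pd.DataFrame({
    "A": 100 + common + rng.normal(scale=0.05, size=60),
    "B": 50 + common + rng.normal(scale=0.05, size=60),
    "C": 80 + rng.normal(size=60).cumsum(),  # independent random walk
})

corr_matrix = prices.corr()

# Hide the diagonal (self-correlation of 1.0) with NaN instead of
# overwriting everything above 0.9999 with 0.
diagonal = np.eye(len(corr_matrix), dtype=bool)
off_diag = corr_matrix.mask(diagonal)

# idxmax() returns the row label of each column's maximum, i.e. the partner.
partners = pd.DataFrame({
    "partner": off_diag.idxmax(),
    "correlation": off_diag.max(),
})
print(partners)
```

idxmax() and max() both skip NaN by default, so the masked diagonal never wins.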
So obviously the securities with the same correlation value should be paired, since that value is their correlation with each other.
But this is where I get kinda lost.
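One complication is that the per-security maximum does not always give mutual pairs: in the output above, T and MSFT show values that match no other row, which suggests their best partner is itself better matched with someone else. A possible sketch for this step, assuming a greedy rule (my assumption, not something from the linked post): list every unordered pair once, sort by correlation, and take pairs from the top while skipping securities already used. The function names, the max_pairs parameter and the example matrix are all hypothetical.

```python
import numpy as np
import pandas as pd

def rank_pairs(corr_matrix):
    """Return each unordered pair once, sorted by correlation, highest first."""
    # The strict upper triangle drops the diagonal and the mirrored (B, A) entries.
    upper = np.triu(np.ones(corr_matrix.shape, dtype=bool), k=1)
    return corr_matrix.where(upper).stack().dropna().sort_values(ascending=False)

def pick_disjoint_pairs(ranked, max_pairs=None):
    """Greedily take the top pairs so that no security appears twice."""
    used, pairs = set(), []
    for (a, b), rho in ranked.items():
        if a in used or b in used:
            continue
        pairs.append((a, b, rho))
        used.update((a, b))
        if max_pairs is not None and len(pairs) >= max_pairs:
            break
    return pairs

# Made-up correlation matrix for illustration only.
names = ["A", "B", "C", "D"]
corr = pd.DataFrame(
    [[1.00, 0.95, 0.30, 0.20],
     [0.95, 1.00, 0.40, 0.10],
     [0.30, 0.40, 1.00, 0.80],
     [0.20, 0.10, 0.80, 1.00]],
    index=names, columns=names,
)

ranked = rank_pairs(corr)
pairs = pick_disjoint_pairs(ranked)
print(pairs)
```

Inside my_rebalance the same two calls could be applied to price_history.corr() directly, which would also make the 0.9999 diagonal workaround unnecessary.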
Also, I am pretty sure this isn't the best way of coding it, but any guidance would be much appreciated.
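For the spread and standard-deviation step, here is a rough sketch of one common convention, not necessarily what the linked post does: fit a least-squares hedge ratio, form the spread, and express the latest spread as a z-score (distance from the mean in standard deviations). The helper names, the entry_z threshold of 1.0 and the synthetic price series are hypothetical placeholders.

```python
import numpy as np
import pandas as pd

def spread_zscore(y, x):
    """Z-score of the latest spread y - beta * x over the lookback window."""
    # Least-squares slope of y on x, used here as a simple hedge ratio (an assumption).
    beta, _intercept = np.polyfit(x.values, y.values, 1)
    spread = y - beta * x
    z = (spread.iloc[-1] - spread.mean()) / spread.std()
    return z, beta

def signal(z, entry_z=1.0):
    """Map a z-score to a position; entry_z is a hypothetical threshold."""
    if z > entry_z:
        return "short_spread"   # spread unusually high: short y, long x
    if z < -entry_z:
        return "long_spread"    # spread unusually low: long y, short x
    return "flat"

# Synthetic pair: y tracks 2 * x, then jumps on the last day.
t = np.arange(60)
x = pd.Series(np.linspace(50.0, 60.0, 60))
y = 2.0 * x + 1.0 + 0.1 * np.sin(t / 3.0)
y.iloc[-1] += 5.0

z, beta = spread_zscore(y, x)
print(round(beta, 2), signal(z))
```

In the algorithm, y and x would be the two price_history columns of a chosen pair, and the returned label would decide whether to open, hold or close the position.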