Lecture Series Long-Short Cross Sectional Momentum Row and Column Data

Hi,

I understand the purpose behind the line (R.T - R.T.mean()).T.mean(); however, I am confused as to what is in the rows and columns prior to this.

Let me know if I have misunderstood any of the following:

This adds market cap to the columns so rows = company name and columns = mkt cap.

pipe_columns = {
'Market Cap' : mkt_cap()
}

Shows only the top 500 companies by market cap, among those with a market cap above 1e8 (still rows = companies, columns = mkt cap)

context.output = pipeline_output('pipeline')
context.output[context.output['Market Cap'] > 1e8]
context.output.sort(['Market Cap'], ascending=False, inplace=True)
context.security_list = context.output.head(500).index
log.info(context.security_list)
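
A minimal sketch of this step on made-up data, to pin down the shape. The tickers and values are invented, `output` is a hypothetical stand-in for the pipeline result, and `sort_values` from current pandas replaces the older `sort` call. One thing it shows: the filter returns a new frame, so on a line of its own (without assignment) it leaves the original unchanged.

```python
import pandas as pd

# Hypothetical stand-in for pipeline_output('pipeline'): one row per company,
# a single 'Market Cap' column. Tickers and values are invented.
output = pd.DataFrame(
    {"Market Cap": [5e9, 2e7, 8e8, 3e10]},
    index=["AAA", "BBB", "CCC", "DDD"],
)

# Boolean indexing returns a new frame; without assignment nothing changes.
output[output["Market Cap"] > 1e8]
unfiltered_rows = len(output)  # still all four rows

# Assigning the result keeps only companies above the threshold.
filtered = output[output["Market Cap"] > 1e8]

# Largest market cap first, then take the top N (N = 2 here instead of 500).
top = filtered.sort_values("Market Cap", ascending=False).head(2)
security_list = top.index  # row labels, i.e. the companies
```

So at this point the rows are companies and the only column is the market cap; `security_list` is just the row labels.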

Takes the log of prices for the top 500 mkt cap companies as per the above matrix

prices = np.log(data.history(context.security_list, 'price', context.lookback, '1d')).dropna(axis=1)
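
To see what this line would leave in the rows and columns, here is a sketch on invented prices. It assumes data.history returns a DataFrame with one row per date and one column per security; that layout is an assumption here, not something confirmed in this post, and the dates, tickers and prices are made up.

```python
import numpy as np
import pandas as pd

# Assumed shape of data.history(...): rows = dates, columns = securities.
dates = pd.date_range("2020-01-01", periods=4, freq="D")
raw = pd.DataFrame(
    {
        "AAA": [10.0, 10.5, 11.0, 10.8],
        "BBB": [20.0, np.nan, 21.0, 22.0],  # one missing price
        "CCC": [5.0, 5.1, 5.2, 5.3],
    },
    index=dates,
)

# np.log works elementwise and keeps the row/column labels.
log_prices = np.log(raw)

# axis=1 drops every column (security) that has at least one NaN.
prices = log_prices.dropna(axis=1)
```

Under that assumption, dropna(axis=1) removes whole securities rather than dates, which would only make sense if the securities are in the columns.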

Gives log returns over the return_window period

Also, what is the purpose of the second line? Doesn't .dropna() already do what np.isfinite does?

R = (prices / prices.shift(context.return_window)).dropna()
R = R[np.isfinite(R[R.columns])].fillna(0)
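
A small check of how the two calls differ, on an invented frame that contains both a NaN and an infinity. With default pandas settings, dropna only treats NaN as missing, while np.isfinite is False for NaN and for +/-inf.

```python
import numpy as np
import pandas as pd

# Invented frame: a NaN in one column, an infinity in the other.
R = pd.DataFrame(
    {"AAA": [1.01, np.nan, 0.99, 1.00], "BBB": [1.02, 1.00, np.inf, 1.01]}
)

# dropna removes the row holding the NaN but keeps the row holding inf.
after_dropna = R.dropna()

# Indexing with the isfinite mask turns NaN and inf cells into NaN,
# and fillna(0) then replaces them with 0.
cleaned = R[np.isfinite(R[R.columns])].fillna(0)
```

So the second line would matter whenever an infinite value survives the dropna, for example after a division by zero.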

My confusion is here

ranks = (R.T - R.T.mean()).T.mean()

My understanding is the companies are in the rows and log returns in columns. Transposing this wouldn't give a cross sectional average as the default axis of 0 would already give this without transposing. So my assumption is that I'm wrong about what is in the rows and columns but can't work out why.
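
One way to test the alternative layout is a sketch on invented numbers, assuming rows = dates and columns = securities (again an assumption about what data.history returns, not something confirmed here).

```python
import pandas as pd

# Assumed layout: rows = dates, columns = securities (invented numbers).
dates = pd.date_range("2020-01-01", periods=3, freq="D")
R = pd.DataFrame(
    {"AAA": [1.0, 2.0, 3.0], "BBB": [3.0, 2.0, 1.0], "CCC": [5.0, 5.0, 5.0]},
    index=dates,
)

# R.T has securities in rows and dates in columns, so the default axis=0
# mean averages across securities and gives one value per date.
cross_sectional_mean = R.T.mean()

# Subtracting a Series from a DataFrame aligns on the columns (dates here),
# so each date's cross-sectional mean is removed from every security.
demeaned = (R.T - cross_sectional_mean).T  # back to rows = dates

# Default axis=0 mean again: average over dates, one value per security.
ranks = demeaned.mean()
```

With this layout the transpose is what lets the per-date mean line up with the dates before subtracting; the final mean is then taken over time, leaving one score per security.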

Can anyone explain why log prices are in the rows and company names are in the columns (if that is the case)?

Thanks