The pipeline code you have is simply the 'definition' of the pipeline. It defines the columns to be returned in the dataframe (i.e. the data) and a screen which 'filters' the rows down to a subset of securities (the dataframe index is the securities). By definition, the dataframe output by the pipeline is sorted by security SID number. You can't specify a sort order in the pipeline definition.
However, you certainly can sort the dataframe once it's generated. You will need to add a column for 'market cap' so you have something to sort on. See https://www.quantopian.com/help#built-in-factors and look at the very end of the list of factors for 'MarketCap'.
# Market Capitalization is used often enough that it's included as a built-in factor.
# It can also be found under the Morningstar fundamentals.
# Make sure you import the class before using it.
import quantopian.pipeline.factors as Factors

def make_pipeline(context):
    rolling_correlations = RollingLinearRegressionOfReturns(
        target=context.regression_target,
        returns_length=context.regression_returns_length,
        regression_length=context.regression_length,
        mask=context.base_universe
    )
    beta = rolling_correlations.beta
    should_buy_beta = beta < context.beta_threshold
    securities_to_trade = (context.base_universe & should_buy_beta)

    # New column to sort on
    market_cap = Factors.MarketCap()

    return Pipeline(
        columns={
            'beta': beta,
            'should_buy_beta': should_buy_beta,
            'securities_to_trade': securities_to_trade,
            'market_cap': market_cap
        },
        screen=securities_to_trade
    )
When you get the pipeline data (typically in the 'before_trading_start' function) it will now have a column called 'market_cap' which you can sort on.
todays_data = pipeline_output('my_pipeline')
todays_data.sort_values('market_cap', ascending=False, inplace=True)
The pipeline output will now be in the dataframe 'todays_data', sorted by market cap with the largest first.
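To see the sort in isolation, here is a minimal pandas sketch that runs outside Quantopian. The dataframe is a made-up stand-in for what 'pipeline_output' returns; the tickers and numbers are hypothetical and only the 'market_cap' column name comes from the pipeline above.

```python
import pandas as pd

# Hypothetical stand-in for pipeline_output('my_pipeline').
# Tickers and values are invented for illustration only.
todays_data = pd.DataFrame(
    {
        'beta': [0.4, 0.7, 0.2, 0.5],
        'market_cap': [5.0e9, 2.5e11, 8.0e8, 4.0e10],
    },
    index=['AAA', 'BBB', 'CCC', 'DDD'],
)

# Sort largest-first. Without inplace=True a new, sorted dataframe is
# returned and the original is left untouched.
by_cap = todays_data.sort_values('market_cap', ascending=False)

# Once sorted, head(n) keeps just the n largest rows.
largest_two = by_cap.head(2)
```

Whether you sort in place or assign the result to a new name is mostly a matter of taste; the non-inplace form keeps the original SID-ordered dataframe around in case you still need it.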
However, you may not really want a sorted list; you may just want the top 20 (or some other number) of the largest market cap stocks. If that's what you really want, then that CAN be done in the pipeline definition. Simply use the factor's '.top' method to screen the output (documented here https://www.quantopian.com/help#quantopian_pipeline_filters_Filter scroll up the page a bit).
import quantopian.pipeline.factors as Factors

def make_pipeline(context):
    rolling_correlations = RollingLinearRegressionOfReturns(
        target=context.regression_target,
        returns_length=context.regression_returns_length,
        regression_length=context.regression_length,
        mask=context.base_universe
    )
    beta = rolling_correlations.beta
    should_buy_beta = beta < context.beta_threshold
    securities_to_trade = (context.base_universe & should_buy_beta)
    market_cap = Factors.MarketCap()

    return Pipeline(
        columns={
            'beta': beta,
            'should_buy_beta': should_buy_beta,
            'securities_to_trade': securities_to_trade,
            'market_cap': market_cap
        },
        # Set the screen to however many stocks you want returned.
        # Make sure to include the mask.
        screen=market_cap.top(20, mask=securities_to_trade)
    )
Notice the screen is set to just the top 20 stocks by market cap. Make sure you set the mask to your 'securities_to_trade'. This ensures that the '.top' method ranks only those specific stocks and returns the top 20 WITHIN THOSE STOCKS. Note that it may return fewer than 20 if there are fewer than 20 stocks in 'securities_to_trade'.
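If the effect of the mask isn't obvious, the plain-Python sketch below mimics the selection logic. It is not Quantopian's implementation; the helper 'top_n_within_mask' and the sample data are hypothetical, made up only to show why ranking inside the mask differs from taking the overall top N and filtering afterwards.

```python
def top_n_within_mask(values, mask, n):
    """Return up to n keys with the largest values, ranking only keys in mask."""
    candidates = {k: v for k, v in values.items() if k in mask}
    ranked = sorted(candidates, key=candidates.get, reverse=True)
    return ranked[:n]

# Invented market caps and an invented tradeable set.
market_cap = {'AAA': 900, 'BBB': 700, 'CCC': 500, 'DDD': 300, 'EEE': 100}
securities_to_trade = {'BBB', 'DDD', 'EEE'}

# Ranking inside the mask: the two largest among BBB, DDD and EEE.
masked_top = top_n_within_mask(market_cap, securities_to_trade, 2)

# Ranking everything first and filtering afterwards can leave fewer names,
# because AAA takes one of the two slots and is then thrown away.
overall_top = top_n_within_mask(market_cap, set(market_cap), 2)
filtered_after = [s for s in overall_top if s in securities_to_trade]

# A mask smaller than n simply yields fewer than n results.
short_list = top_n_within_mask(market_cap, {'EEE'}, 2)
```

Here 'masked_top' holds two tradeable names while 'filtered_after' holds only one, which is the reason for passing the mask to '.top' rather than combining the filters afterwards.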