@Ben. The if_else method expects two factors as arguments. It doesn't expect an integer (in this case 0). That is what's causing the error "TypeError: Filter.if_else() expected a value of type ComputableTerm for argument 'if_true', but got int instead." The numpy where function you tried fails for the reason noted previously: factors are not actual data or arrays, and therefore cannot be used as inputs to numpy functions.
One could use the if_else method again to handle the case where the total number of stocks is less than a minimum. A 'trick' is that the top method will return zero assets if the N parameter is less than or equal to 0. Something like this:
# get the minimum of either buys or sells
min_of_buys_sells_signals = (buys_count < sells_count).if_else(buys_count, sells_count)
# require a minimum number of buys and sells
# if our count is less than this minimum then set to a negative number
# setting to a negative number will return zero assets when using the `top` method
min_required_buys_and_sells = 60
negative_count = min_of_buys_sells_signals - min_required_buys_and_sells
count_less_than_min_required = min_of_buys_sells_signals < min_required_buys_and_sells
sigs_today = count_less_than_min_required.if_else(negative_count, min_of_buys_sells_signals)
# will not return any assets if `sigs_today` is negative
buys = indicators.buy_sigs.top(sigs_today)
sells = indicators.sell_sigs.top(sigs_today)
This should do what you want. A word of caution: pipeline will raise an error in the notebook environment if no rows are returned across the entire date range. It's OK if it doesn't return anything for some dates, but returning nothing for all dates confuses it. This can happen if the minimum value is set too high and there are never enough buys and sells to meet it. (Setting the minimum value to 100 will cause this.)
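The counting logic in the snippet above can be sanity-checked outside of pipeline with plain integers. This is only a sketch that mirrors the arithmetic; `signals_today` is a hypothetical helper, not part of the pipeline API, and it works on ordinary ints rather than factors.

```python
def signals_today(buys_count, sells_count, min_required=60):
    """Mirror the pipeline arithmetic using plain integers (hypothetical helper)."""
    # take the smaller of the two counts, like the first if_else
    smaller = buys_count if buys_count < sells_count else sells_count
    # below the floor: return a negative number so `top` would select nothing
    if smaller < min_required:
        return smaller - min_required
    return smaller


print(signals_today(80, 70))  # 70: both counts clear the floor
print(signals_today(50, 70))  # -10: below the floor, so no assets
print(signals_today(60, 90))  # 60: exactly at the floor still counts
```

Note that a count exactly equal to the minimum passes, because the comparison is strictly "less than".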
All that said, this could probably be done more clearly after the pipeline is run. One can then use standard python, numpy, and pandas. Pipeline is restricted to factor and filter methods which, as we are seeing, are a bit limited at times. In fact, I often use pipeline solely to fetch data, with no logic, and completely separate getting the data from manipulating the data. Then, once pipeline is run, the data can be manipulated, sliced, and diced as needed. The separation of data and code may be more intuitive to some.
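As a rough sketch of that approach, the same 'floor' can be applied with pandas once the pipeline output is in hand. Everything here is an assumption for illustration: the column names buy_sigs and sell_sigs, the convention that a positive score means a signal, the made-up sample data, and the hypothetical helper `apply_signal_floor`. The frame is indexed by (date, asset), as it is in a notebook.

```python
import pandas as pd


def apply_signal_floor(df, min_required=60):
    """Per date, keep the top-N buys and sells, where N is the smaller of the
    two signal counts. Dates where N is below min_required are dropped."""
    kept = []
    for date, day in df.groupby(level=0):
        buys = day[day["buy_sigs"] > 0]    # assumption: positive score = signal
        sells = day[day["sell_sigs"] > 0]
        n = min(len(buys), len(sells))
        if n < min_required:
            continue  # not enough signals on this date
        kept.append(buys.nlargest(n, "buy_sigs").assign(side="buy"))
        kept.append(sells.nlargest(n, "sell_sigs").assign(side="sell"))
    # return an empty frame (rather than failing) if no date qualifies
    return pd.concat(kept) if kept else df.iloc[0:0]


# made-up data: two dates, four hypothetical assets
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2020-01-02", "2020-01-03"]), ["AAA", "BBB", "CCC", "DDD"]],
    names=["date", "asset"],
)
df = pd.DataFrame(
    {
        "buy_sigs": [3.0, 2.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
        "sell_sigs": [0.0, 0.0, 1.0, 2.0, 0.0, 4.0, 5.0, 6.0],
    },
    index=index,
)

result = apply_signal_floor(df, min_required=2)
print(result)
```

With a minimum of 2, only the first date survives: the second date has a single buy signal, so it is skipped entirely.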
There are really only two drawbacks to doing all the logic after pipeline is run. First, a pipeline definition can typically be copied and pasted from a notebook to an algo in the IDE and it will run identically in both environments. However, because a pipeline returns a multi-indexed dataframe in a notebook and a single-indexed dataframe in an algo, the code for manipulating the dataframes will be different. Having the same code for both makes for an efficient development workflow. Second, if the logic is encapsulated within a pipeline, it can more easily be analyzed using Alphalens. Anyway, something to consider. It's personal preference.
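One possible way to soften the first drawback is a small adapter that hides the index difference, so the per-day logic can be shared. This is only a sketch; `iter_daily_frames` is a hypothetical helper, and it assumes the multi-indexed output has the date as its first index level.

```python
import pandas as pd


def iter_daily_frames(pipeline_output):
    """Yield (date, single-indexed frame) pairs for either output shape."""
    if isinstance(pipeline_output.index, pd.MultiIndex):
        # notebook-style output: one group per date, with the date level dropped
        for date, day in pipeline_output.groupby(level=0):
            yield date, day.droplevel(0)
    else:
        # algo-style output: already a single day, so no date is attached
        yield None, pipeline_output


# made-up example of each shape
multi = pd.DataFrame(
    {"buy_sigs": [1.0, 2.0, 3.0, 4.0]},
    index=pd.MultiIndex.from_product(
        [pd.to_datetime(["2020-01-02", "2020-01-03"]), ["AAA", "BBB"]],
        names=["date", "asset"],
    ),
)
single = pd.DataFrame({"buy_sigs": [1.0, 2.0]}, index=["AAA", "BBB"])

multi_days = list(iter_daily_frames(multi))
single_days = list(iter_daily_frames(single))
print(len(multi_days), len(single_days))
```

Any function written against a single day's frame can then be called from both the notebook and the algo.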
Check out the attached notebook. It should be working as you expected and setting a 'floor' for the number of buys and sells.