I have created a custom factor to detect the presence of a long-term trend. In the pipeline, the custom factor builds a dataframe of historical data per asset and passes it to a function that looks for a fitting trend and returns the start date of that trend. Doing it this way avoids a large number of database calls.
However, I currently get only one value per asset as output, whereas the same calculation could easily produce many more.
import numpy as np
import pandas as pd

class weeklyConfirmations(CustomFactor):
    inputs = [USEquityPricing.low, USEquityPricing.high, USEquityPricing.volume]
    window_safe = True

    # Resamples daily data to weekly bars, then evaluates each asset
    def compute(self, today, assets, out, low, high, volume):
        # Reshape each (days x assets) array into a long Series indexed by (day, sid)
        low_df = pd.DataFrame(low, columns=assets).stack()
        high_df = pd.DataFrame(high, columns=assets).stack()
        volume_df = pd.DataFrame(volume, columns=assets).stack()
        df = pd.concat([low_df, high_df, volume_df], axis=1)
        df.columns = ['low', 'high', 'volume']
        df.index = df.index.rename(['day', 'sid'])

        # Since this historic data does not have a timeseries index, we cannot use the
        # resample function to produce weekly data. We use the following instead:
        n = 5
        df2 = df.reset_index()
        df2.sort_values(by=['sid', 'day'], inplace=True)
        df3 = df2.groupby([np.arange(len(df2)) // n, 'sid']).agg({'day': max, 'low': min, 'high': max, 'volume': sum})
        df4 = df3.reset_index()
        df4.sort_values(by=['day', 'sid'], inplace=True)
        df4.set_index(['day', 'sid'], inplace=True)
        df4.drop('level_0', axis=1, inplace=True)
        # print('with the new index it looks like \n', df4.tail())

        def my_df_function(sid_df):
            my_result = longTermConfirmed(sid_df)
            return my_result

        # Rather than looping over each security it's much faster to group by security and apply a function
        buy_signal = df4.groupby(level=1).apply(my_df_function)
        out[:] = buy_signal
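The weekly resampling step above can be checked outside the pipeline on toy data. The following is only a sketch: the helper name to_weekly and the sample arrays are made up for illustration, and instead of numbering all rows with one global np.arange it numbers the rows within each asset, so that a window length that is not a multiple of n cannot put rows from two assets into the same bar.

```python
import numpy as np
import pandas as pd

def to_weekly(low, high, volume, assets, n=5):
    """Collapse (days x assets) arrays into n-day bars per asset (hypothetical helper)."""
    n_days, n_assets = low.shape
    # Long format: one row per (day, sid); ravel() walks the arrays day by day
    long_df = pd.DataFrame({
        'day': np.repeat(np.arange(n_days), n_assets),
        'sid': np.tile(np.asarray(assets), n_days),
        'low': low.ravel(),
        'high': high.ravel(),
        'volume': volume.ravel(),
    })
    long_df = long_df.sort_values(['sid', 'day'])
    # Bar number counted within each asset, not across the whole frame
    long_df['bar'] = long_df.groupby('sid').cumcount() // n
    weekly = long_df.groupby(['sid', 'bar']).agg(
        {'day': 'max', 'low': 'min', 'high': 'max', 'volume': 'sum'})
    weekly = weekly.reset_index().drop('bar', axis=1)
    return weekly.set_index(['day', 'sid']).sort_index()

# Toy data: 10 days, 2 assets
low = np.arange(20, dtype=float).reshape(10, 2)
high = low + 1.0
volume = np.ones((10, 2))
weekly = to_weekly(low, high, volume, assets=[101, 102])
print(weekly)
```

The result has the same (day, sid) index as df4 in the factor, so it can be fed to the same groupby(level=1).apply(...) call.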
The actual logic resides in the function longTermConfirmed(sid_df), which is called from the CustomFactor in the code above:
from scipy.signal import argrelextrema

def longTermConfirmed(stock_hist):
    trendStartDate = np.nan
    my_df = pd.DataFrame(columns=['Start date', 'Confirmations', 'Squared distance', 'Porosity', 'Porosity at latest conf date'])
    length = len(stock_hist)
    n = 2
    # Local minima of the weekly lows are the candidate starting points of a trend
    launchpoints = list(argrelextrema(stock_hist['low'].values, np.less, axis=0, order=n))
    newlist = launchpoints[0].tolist()
    # We consider only trends that are at least 2 months old, hence subtracting 8 weeks
    newlist2 = [i for i in newlist if i < length - 8]
    for i in newlist2:
        dft = stock_hist.copy().iloc[i:]
        trend = calcTrend(dft)
        result = confirmTrend(trend, 'S', .002, .005, 1)
        if result.empty == False:
            datelist = result.index
            porosityAtLatestConfDate = result.iloc[-1]['porosity']
            result['dist_squared'] = result['distance']**2
            confCount = len(result.index)
            latestConfirmationDate = datelist[-1][0]
            latestRelOBV = result.iloc[-1]['OBV_rel']
            squared = result['dist_squared'].sum()
            new_data = {'Start date': i,
                        'Confirmations': confCount,
                        'Squared distance': squared,
                        # 'Porosity': porosity,
                        'Porosity at latest conf date': porosityAtLatestConfDate,
                        'Latest Confirmation Date': latestConfirmationDate,
                        'Rel OBV': latestRelOBV}
            my_df = my_df.append(new_data, ignore_index=True)
    if my_df.empty == False:
        my_df = my_df.sort_values(["Confirmations", "Squared distance"], ascending=(False, True))
        mask = (my_df['Confirmations'] >= 2)  # require at least two confirmations to be considered
        df2 = my_df[mask]
        emptydf2 = df2.empty
        if emptydf2 == False:
            # sort_values returns a new frame, so the result has to be assigned back
            df2 = df2.sort_values(['Latest Confirmation Date', 'Rel OBV'], ascending=[0, 0])
            if df2.iloc[0]['Latest Confirmation Date'] >= length * 5 - 1:
                trendStartDate = length * 5 - df2.iloc[0]['Start date'] * 5
    return trendStartDate
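The candidate-ranking part of the function above can be looked at in isolation. This is a rough sketch under my own assumptions: pick_best_trend is a hypothetical helper, the sample rows are invented, and it ranks only by the final sort keys (latest confirmation date, then relative OBV). It collects the rows in a plain list and builds the DataFrame once, which also sidesteps DataFrame.append (removed in pandas 2.0).

```python
import pandas as pd

def pick_best_trend(candidates, min_confirmations=2):
    """Return the top-ranked candidate as a dict, or None if nothing qualifies.

    `candidates` is a list of dicts with the same keys as `new_data` above.
    """
    if not candidates:
        return None
    table = pd.DataFrame(candidates)  # built once instead of appending row by row
    table = table[table['Confirmations'] >= min_confirmations]
    if table.empty:
        return None
    # Most recent confirmation first, ties broken by the higher relative OBV
    table = table.sort_values(['Latest Confirmation Date', 'Rel OBV'],
                              ascending=[False, False])
    return table.iloc[0].to_dict()

# Invented sample rows, one per candidate starting point
candidates = [
    {'Start date': 3, 'Confirmations': 1, 'Squared distance': 0.1,
     'Latest Confirmation Date': 50, 'Rel OBV': 0.9},
    {'Start date': 7, 'Confirmations': 3, 'Squared distance': 0.4,
     'Latest Confirmation Date': 44, 'Rel OBV': 0.2},
    {'Start date': 12, 'Confirmations': 2, 'Squared distance': 0.3,
     'Latest Confirmation Date': 49, 'Rel OBV': 0.5},
]
best = pick_best_trend(candidates)
print(best)
```

Returning the whole winning row rather than a single number keeps every field available to the caller.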
This code is now set up to return the date trendStartDate at which a particular uptrend started.
That same longTermConfirmed() function could output additional valuable information (e.g. the number of times the trend has been confirmed) that I intend to use in my algo. As there is some computationally heavy work going on, I want to avoid running close variants of that same function when I could do it all in one pass.
However, I am drawing a complete blank on how to pass multiple values per asset back from longTermConfirmed() to the CustomFactor.
Does anyone know how I could solve this?
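For reference, one possible direction, sketched under assumptions: Zipline's CustomFactor accepts an outputs list of names, in which case out inside compute() exposes one array per name (out.some_name[:] = ...). Assuming that is available on this platform, the per-asset function could return a pd.Series instead of a scalar, so that groupby(...).apply(...) yields one column per value. In the sketch below a NumPy record array stands in for out, and long_term_summary, fill_outputs and the field names are hypothetical placeholders rather than the real trend logic.

```python
import numpy as np
import pandas as pd

# Hypothetical output names; in a CustomFactor these would be listed as
#   outputs = ['trend_start', 'confirmations', 'squared_distance']
FIELDS = ['trend_start', 'confirmations', 'squared_distance']

def long_term_summary(sid_df):
    """Placeholder for longTermConfirmed(): returns several values as one Series."""
    return pd.Series({
        'trend_start': float(sid_df['low'].values.argmin()),
        'confirmations': float(len(sid_df)),
        'squared_distance': float(((sid_df['high'] - sid_df['low']) ** 2).sum()),
    })

def fill_outputs(out, weekly, assets):
    """Run the per-asset function once and copy each result column into `out`."""
    summary = weekly.groupby(level='sid').apply(long_term_summary)
    # Align rows with the asset order; assets without data become NaN
    summary = summary.reindex(assets)
    for name in FIELDS:
        out[name][:] = summary[name].values

# Toy weekly data indexed by (day, sid), as produced by the factor above
idx = pd.MultiIndex.from_product([[4, 9], [101, 102]], names=['day', 'sid'])
weekly = pd.DataFrame({'low': [3.0, 1.0, 2.0, 5.0],
                       'high': [4.0, 3.0, 4.0, 6.0]}, index=idx)
assets = [101, 102, 103]

# Record array standing in for the `out` argument of compute()
out = np.recarray(len(assets), dtype=[(name, 'f8') for name in FIELDS])
fill_outputs(out, weekly, assets)
print(out.trend_start, out.confirmations, out.squared_distance)
```

The point of the sketch is that the heavy function runs once per asset and every value it produces lands in its own named output.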