I have the following schedule_function call in my initialize function:
schedule_function(handle_data, date_rules.month_start(), time_rules.market_close())
Inside handle_data I write some results to an initially empty DataFrame, adding a new row each time handle_data is called. Each row holds the date (as the index) and the mean of some prices:
close_price_mean = buy_list.mean()
today = zipline.api.get_datetime().date()
df.loc[today] = pd.Series({'Average portfolio perf': close_price_mean})
When I run the backtest, I am surprised to find a new row for every day of the backtest, rather than only at the start of each month, as the schedule_function call specifies with date_rules.month_start(). Does this mean that handle_data is actually being called every day in the background?
The backtest results themselves appear to form portfolios only at the start of each month, as you would expect, yet the DataFrame receives a row for every day of the backtest, which suggests that handle_data is somehow being called daily.
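To make the question concrete, here is a dependency-free sketch of what I suspect might be happening. It is not zipline's implementation: trading_days, run_toy_backtest and the per_bar / monthly parameters are hypothetical names made up for illustration, and the calendar ignores holidays. My understanding is that zipline calls whatever function is registered as handle_data on every bar, so if the same function is also scheduled, it would end up running every day:

```python
from datetime import date, timedelta


def trading_days(start, end):
    """Yield weekdays from start to end inclusive (holidays are ignored)."""
    day = start
    while day <= end:
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            yield day
        day += timedelta(days=1)


def run_toy_backtest(start, end, per_bar=None, monthly=None):
    """Hypothetical dispatcher, NOT zipline's actual code.

    Calls per_bar on every trading day and monthly on the first
    trading day of each month.
    """
    last_month = None
    for day in trading_days(start, end):
        if per_bar is not None:
            per_bar(day)
        if monthly is not None and (day.year, day.month) != last_month:
            monthly(day)
        last_month = (day.year, day.month)


rows = {}  # stands in for the DataFrame, keyed by date


def handle_data(day):
    # Mirrors df.loc[today] = ...: one row per distinct date.
    rows[day] = "logged"


start, end = date(2020, 1, 1), date(2020, 3, 31)

# Case 1: the function is only scheduled monthly.
run_toy_backtest(start, end, monthly=handle_data)
monthly_only = len(rows)

# Case 2: the same function is also the per-bar callback.
rows.clear()
run_toy_backtest(start, end, per_bar=handle_data, monthly=handle_data)
both = len(rows)

print(monthly_only, both)
```

In this toy model the first case logs one row per month and the second logs one row per trading day, which matches what I am seeing. If that is really the cause, I assume giving the scheduled function a name other than handle_data would avoid the daily calls, but I would appreciate confirmation.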
I would be very grateful if anyone could help me understand what is happening here.