Hi all,
We recently did some thinking about how the transforms should work. I think I have a clearer view of the issues now, but I wanted to run this by the community to make sure it is the right thing to do.
The central idea is that when specifying a window length we don't care about absolute (calendar) time; what we care about is trading time. For example, if I ask for a 2 day moving average on a Monday, I want the data from the previous Thursday and Friday.
That's the way transforms have been working. However, we also want to support finer granularity. So the proposed interface would look something like this:
data[sid(24)].mavg(days=3)
data[sid(24)].mavg(minutes=10)
The question is how this should behave in certain corner cases (assuming the market opens at 9:30am and closes at 4pm). Some examples:
1. Current: 9:40 am Fri, 20 minute window.
reaches back until: 3:50pm, Thurs
2. Current: 2pm Fri, 1 day window
reaches back until: 2pm Thurs
3. Current: 2pm Fri, 1 day window. However, Thurs is a half-day and the market closed at 12pm.
reaches back until: Thurs 12pm
The algorithm to implement this (in pseudo code) would be as follows:
t1 = cur_time - convert_to_calendar_days(cur_time, days)
t2 = t1 - minutes
reach_back_until = move_to_earliest_trading_time(t2)
where days and minutes are the user-supplied parameters,
cur_time is the current time,
convert_to_calendar_days() calculates how many calendar days we have to go back from cur_time to cover 'days' trading days,
move_to_earliest_trading_time() moves a time that falls outside trading hours back into the previous session, e.g. 9:20am -> 3:50pm on the previous trading day.
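To make the corner cases concrete, here is a rough Python sketch of one possible reading of the pseudo code. It is not the actual transform implementation. All names (reach_back_until, session_bounds, previous_session, early_closes) are made up for illustration, the calendar only knows about weekends and an optional dict of early closes (no holidays), and cur_time is assumed to fall inside a trading session.

```python
from datetime import date, datetime, time, timedelta

OPEN = time(9, 30)           # assumed regular open
REGULAR_CLOSE = time(16, 0)  # assumed regular close


def session_bounds(day, early_closes):
    """Return (open, close) datetimes for `day`, or None on a weekend."""
    if day.weekday() >= 5:  # Saturday or Sunday; holidays are ignored here
        return None
    close = early_closes.get(day, REGULAR_CLOSE)
    return datetime.combine(day, OPEN), datetime.combine(day, close)


def previous_session(day, early_closes):
    """Return the closest trading day strictly before `day`."""
    day -= timedelta(days=1)
    while session_bounds(day, early_closes) is None:
        day -= timedelta(days=1)
    return day


def reach_back_until(cur_time, days=0, minutes=0, early_closes=None):
    """Start of a window of `days` trading days plus `minutes` trading minutes."""
    early_closes = early_closes or {}

    # Step back `days` trading days, keeping the time of day.
    day = cur_time.date()
    for _ in range(days):
        day = previous_session(day, early_closes)
    open_, close = session_bounds(day, early_closes)
    # If that day closed early, clamp to its close (2pm -> 12pm on a half-day).
    t = min(datetime.combine(day, cur_time.time()), close)

    # Walk back `minutes` of trading time, skipping over closed hours.
    remaining = timedelta(minutes=minutes)
    while t - remaining < open_:
        remaining -= t - open_  # use up what is left of this session
        day = previous_session(day, early_closes)
        open_, close = session_bounds(day, early_closes)
        t = close               # continue from the previous close
    return t - remaining


# The three examples from above (2013-03-08 is a Friday).
print(reach_back_until(datetime(2013, 3, 8, 9, 40), minutes=20))  # 2013-03-07 15:50:00
print(reach_back_until(datetime(2013, 3, 8, 14, 0), days=1))      # 2013-03-07 14:00:00
half_day = {date(2013, 3, 7): time(12, 0)}
print(reach_back_until(datetime(2013, 3, 8, 14, 0), days=1,
                       early_closes=half_day))                    # 2013-03-07 12:00:00
```

Note that the sketch walks back session by session instead of first converting to calendar days. The result for the three examples is the same, and it avoids having to special-case weekends in the minute arithmetic.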
Now, maybe you don't always want to specify a strict time window. Especially for an illiquid stock that only trades e.g. every 10 minutes, a 10 minute mavg would be pretty meaningless, since the window would contain a single trade at most. Instead, we could add an option that says, "give me 10 minutes worth of data". So if you run a minute simulation, you'd always have 10 events in the window. In other words, we'd be counting bars instead of time. Maybe mavg(minutes=10, count_bars=True)?
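To illustrate what counting bars would mean, here is a minimal sketch of a moving average over the last N events, regardless of how far apart in time they are. The class name BarCountMavg and its update() method are hypothetical and only meant to show the idea, not the proposed interface.

```python
from collections import deque


class BarCountMavg:
    """Moving average over the last `bars` events, ignoring their timestamps."""

    def __init__(self, bars):
        # A deque with maxlen drops the oldest bar automatically when full.
        self.window = deque(maxlen=bars)

    def update(self, price):
        """Add the newest bar and return the average of the bars in the window."""
        self.window.append(price)
        return sum(self.window) / len(self.window)


mavg = BarCountMavg(bars=3)
for price in [1.0, 2.0, 3.0, 4.0]:
    latest = mavg.update(price)
print(latest)  # 3.0, the average of the last three bars (2.0, 3.0, 4.0)
```

With this behaviour a stock that trades once every 10 minutes would still have 10 prices in a "10 minute" window; they would just span 100 minutes of trading time.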
Any form of feedback is greatly appreciated.