Hi Michael,
You're on the right track. You got the syntax right, and you're thinking about the shape of the data, which is good. When you pass a factor as an input to a CustomFactor with window_length = n, the factor's value as it would have been computed on each of the last n days is populated into the MxN array (one row per day in the window, one column per asset). Essentially, the CustomFactor knows how to turn the input term into the MxN array that you need.
However, this is only true for pipeline terms that are 'window safe'. A 'window safe' term is one that is robust to pricing adjustments from splits or dividends: its value is the same no matter what day you are looking back from. This is true for normalized values such as returns. SimpleMovingAverage is not a window_safe factor because its result can change depending on whether an adjustment has been applied yet. This matters because the MxN matrix is generated using data as it would have been seen on each day in the lookback window.
As an example, say we have a stock XXX with the following unadjusted price history:
$5, $5, $1, $1
and let's say that on the third price, the stock dropped to $1 because of a 1-to-5 split. If we do a lookback after the 1st day of the data, the adjusted history would look like this:
$5
after the 2nd day it would look like this:
$5, $5
after the 3rd day (1-to-5 split occurred) it would look like this:
$1, $1, $1
after the 4th day:
$1, $1, $1, $1
If I ask for the rolling 2-day SMA, you can see how it would change based on the day I'm asking from, because the split is only applied once it has occurred. This is what the 2-day SMA, computed separately on each day, would look like as an input to a CustomFactor:
$5, $5, $1, $1
Note how it looks like there is a lot of volatility in the price, when in reality, the adjusted price didn't change at all.
However, returns are normalized. This is what the 2-day returns would look like as computed each day:
0%, 0%, 0%, 0%, regardless of which day I'm asking from.
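To make the arithmetic above concrete, here is a minimal NumPy sketch of the XXX example. It is not how pipeline computes anything internally; the helper names (history_as_of, sma_as_of, return_as_of) and the way the split is encoded are my own assumptions for illustration.

```python
import numpy as np

# Unadjusted closes for the hypothetical stock XXX, one per day.
raw_prices = np.array([5.0, 5.0, 1.0, 1.0])

# Assumption for this sketch: the split takes effect on the third price
# (index 2), and earlier prices are divided by 5 once it is known.
SPLIT_DAY = 2
SPLIT_FACTOR = 5.0


def history_as_of(day):
    """Adjusted price history as seen at the end of `day` (0-based)."""
    view = raw_prices[: day + 1].copy()
    if day >= SPLIT_DAY:
        # The split is known from this day on, so earlier prices are adjusted.
        view[:SPLIT_DAY] /= SPLIT_FACTOR
    return view


def sma_as_of(day, window=2):
    """2-day SMA computed from the history visible on `day`."""
    return float(history_as_of(day)[-window:].mean())


def return_as_of(day, window=2):
    """Return over the window, computed from the history visible on `day`."""
    view = history_as_of(day)[-window:]
    return float(view[-1] / view[0] - 1.0)


days = range(len(raw_prices))
sma_by_day = [sma_as_of(d) for d in days]
returns_by_day = [return_as_of(d) for d in days]

print(sma_by_day)      # [5.0, 5.0, 1.0, 1.0]
print(returns_by_day)  # [0.0, 0.0, 0.0, 0.0]
```

The SMA values stacked day by day show a jump from 5 to 1 that never happened in adjusted terms, while the returns are the same from every vantage point.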
Does this make sense?
What are you looking to use the Diff_StdDev computation for? I'm wondering if there's a better way to compute a similar statistic that is 'window safe'.