I'm trying to reduce the effects of noise at the edges of the time series look-back window when running linear regressions. The "noise" I'm specifically worried about is dips and pops where, for whatever reason, a stock will temporarily make a dramatic idiosyncratic move upwards or downwards before mean reverting.
Put another way, I think we've all experienced looking at a 3-month chart of a stock and thinking "this has been performing pretty well!", then looking at the 1-year chart and thinking "no, actually this hasn't been performing well at all." Where you cut off the returns window can give wildly different impressions of a stock's performance. What I want to do is de-emphasize the data near the edges of the window, to help eliminate the effects of this "framing bias."
Also, unlike other datasets you might run a linear regression on, where the values are independent of each other, for stock market returns I believe the order of the values holds significance, so truncating the data removes something that was meaningful to the remaining data nearest to it.
Let's say for the sake of simplicity that my look-back window is 20 and my daily gains are: -0.0909, 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
So it's basically just flat, with a symmetrical "dip" at the front of the dataset: the -9.09% drop and the +10% pop cancel out multiplicatively (0.9091 × 1.1 ≈ 1). (Keep in mind these values are daily returns expressed as decimal fractions, which is what you use to calculate alpha and beta.)
On the first day my data would look like this: -0.0909, 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
So it's basically flat returns.
On the second day my data would look like this: 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
Suddenly it shows significant positive returns.
On the third day, my data would look like this: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
So again, flat returns.
So what happened is that a dip at the edge of the look-back window significantly changed the picture of the overall data through time. When calculating alpha and beta for the stock, the values are going to jump around quite a bit, not due to any change in the performance of the stock, but due to the arbitrary length of my look-back window and whatever movements happen to be getting clipped.
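To make the jumpiness concrete, here's a quick sketch using the window mean as a stand-in for the regression statistic (alpha and beta jump around for the same reason):

    import pandas as pd

    # The 24-day gain series from above: a dip that immediately pops back.
    gains = pd.Series([-0.0909, 0.1] + [0.0] * 22)

    # Mean gain inside each sliding 20-day look-back window.
    print(gains.rolling(20).mean().dropna().values)
    # day 1: dip and pop both in the window  -> ~0.00046 (near flat)
    # day 2: only the pop remains            -> 0.005    (suddenly positive)
    # day 3 onward: both clipped off         -> 0.0      (flat again)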
So to solve this, I want to multiply the values of my time series by a series of values produced by the window function you suggested. This will (I think) de-emphasize the truncated movements at the edges of the look-back window, thus creating more useful alpha and beta results. The results will be smoother through time as well, which will help reduce spurious edge-noise-induced rebalancing.
Let's say I use a triangle window function that returns the following values: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1
When applied to the first day in the example above, I would now have: -0.00909, 0.02, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
Slightly positive, but overall flat.
When applied to the second day in the example above, I would now have: 0.01, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
Essentially flat.
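In numpy terms, a minimal sketch of that weighting (the ramp is built by hand here, since scipy's built-in triangular windows use slightly different endpoint conventions):

    import numpy as np

    # Hand-rolled ramp: 0.1, 0.2, ..., 1.0, 1.0, ..., 0.2, 0.1
    n = 20
    triangle = np.minimum(np.arange(1, n + 1), np.arange(n, 0, -1)) / (n / 2)

    # Day 1 window: the dip and the pop sit under the smallest weights.
    day1 = np.array([-0.0909, 0.1] + [0.0] * 18)
    print(day1 * triangle)   # -0.00909, 0.02, then zeros: slightly positive

    # Day 2 window: only the pop remains, down-weighted from 0.1 to 0.01.
    day2 = np.array([0.1] + [0.0] * 19)
    print(day2 * triangle)   # 0.01, then zeros: essentially flat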
The alpha calculation on this ramped dataset will produce a value that moves smoothly from day to day instead of jumping around wildly.
Basically, the trouble I'm having is that I just don't know the correct way to multiply the array produced by the window function (signal.kaiser(self.window_length, 3)) with the asset_returns pandas DataFrame (pd.DataFrame(close, columns=assets).pct_change()[1:]).
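For concreteness, here is my best guess at the shape of it, with placeholder values standing in for self.window_length, close, and assets (which come from elsewhere in my code):

    import numpy as np
    import pandas as pd
    from scipy import signal

    # Placeholder inputs standing in for self.window_length, close, and assets.
    window_length = 20
    assets = ["AAA", "BBB"]
    rng = np.random.default_rng(0)
    close = 100 * np.cumprod(
        1 + 0.01 * rng.standard_normal((window_length + 1, len(assets))), axis=0
    )

    # window_length + 1 closes -> window_length returns once the NaN row is dropped.
    asset_returns = pd.DataFrame(close, columns=assets).pct_change()[1:]

    # signal.windows.kaiser is the current location of signal.kaiser in newer SciPy.
    weights = signal.windows.kaiser(window_length, beta=3)

    # Multiply down the rows (axis=0): each asset column is scaled
    # element-wise by the same weight sequence.
    weighted_returns = asset_returns.mul(weights, axis=0)

My assumption is that inside the rolling computation the same weights would be applied to every window_length-row slice before the regression, not once to the whole return history; am I on the right track?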