I am confused by the specification for the prices parameter of the alphalens get_clean_factor_and_forward_returns() function. If my prices parameter only has prices for dates 11-1, 11-2, 11-5, and 11-6, and I wish to analyze my factor for 11-1 at the 3rd forward return period (i.e. periods=[3]), will it basically compute ((price on 11-6 - price on 11-1) / price on 11-1) as the forward return? Do I need any additional dates in the prices parameter?
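To make the question concrete, here is a minimal pandas sketch of the calculation I have in mind. It is not alphalens's own code, only the arithmetic I am asking about; the ticker "AAA", the year 2018, and the price values are made up for illustration.

```python
import pandas as pd

# Hypothetical prices for one asset on the four dates from the question.
# The ticker, the year, and the values are assumptions for illustration only.
prices = pd.DataFrame(
    {"AAA": [100.0, 101.0, 102.0, 103.0]},
    index=pd.to_datetime(["2018-11-01", "2018-11-02", "2018-11-05", "2018-11-06"]),
)

period = 3

# Forward return at row t: price at row t+period divided by price at row t, minus 1.
# Rows count available timestamps, not calendar days, so the weekend gap is skipped.
forward_returns = prices.shift(-period) / prices - 1

t0 = pd.Timestamp("2018-11-01")
first = forward_returns.loc[t0, "AAA"]  # (103 - 100) / 100
print(first)
print(forward_returns)
```

With only these four rows, 11-1 is the only date that gets a 3-period forward return; the other three rows come out as NaN because there is no price 3 rows ahead of them.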
The docs say I need "an additional buffer window that is greater than the maximum number of expected periods in the forward returns calculations." Why "greater than" and not "equal to"? How much greater than?
I'm also curious whether the prices parameter may contain data for assets that are not part of my 11-1 alpha factor.
This is the full language used in the docs:
Pricing data must span the factor analysis time period plus an additional buffer window that is greater than the maximum number of expected periods in the forward returns calculations. It is important to pass the correct pricing data in depending on what time of period your signal was generated so to avoid lookahead bias, or delayed calculations.
'Prices' must contain at least an entry for each timestamp/asset combination in 'factor'. This entry should reflect the buy price for the assets and usually it is the next available price after the factor is computed but it can also be a later price if the factor is meant to be traded later (e.g. if the factor is computed at market open but traded 1 hour after market open the price information should be 1 hour after market open).
'Prices' must also contain entries for timestamps following each timestamp/asset combination in 'factor', as many more timestamps as the maximum value in 'periods'. The asset price after 'period' timestamps will be considered the sell price for that asset when computing 'period' forward returns.
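My literal reading of that last paragraph can be sketched as a small check: for each factor date, count how many price timestamps follow it and compare that count with the maximum value in periods. The function name missing_forward_coverage is hypothetical and is not part of alphalens; the sketch only encodes how I read the quoted text and assumes the price index has no duplicate timestamps.

```python
import pandas as pd

def missing_forward_coverage(price_index, factor_dates, periods):
    """Return factor dates that have fewer than max(periods) later price rows.

    Hypothetical helper, not an alphalens function. Assumes unique timestamps.
    """
    price_index = pd.DatetimeIndex(price_index).sort_values()
    horizon = max(periods)
    short = []
    for date in pd.DatetimeIndex(factor_dates):
        if date not in price_index:
            # No buy price at all for this factor date.
            short.append(date)
            continue
        pos = price_index.get_loc(date)
        rows_after = len(price_index) - 1 - pos
        if rows_after < horizon:
            short.append(date)
    return short

# Dates from the question; the year 2018 is an assumption.
price_dates = pd.to_datetime(["2018-11-01", "2018-11-02", "2018-11-05", "2018-11-06"])

# A factor on 11-1 only: three price rows follow it, which equals max(periods).
only_first = missing_forward_coverage(price_dates, ["2018-11-01"], [3])

# Adding a factor on 11-2: only two price rows follow it, so it is flagged.
with_second = missing_forward_coverage(price_dates, ["2018-11-01", "2018-11-02"], [3])
```

Under this reading, "equal to" the maximum period would be enough for a factor on 11-1 alone, which is why the "greater than" wording in the first paragraph of the docs confuses me.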