Below are some thoughts, and a question, about this strategy after reading Vidyamurthy's book.
A key component of building a factor model is the selection of factors, which I believe Vidyamurthy intentionally left out of the book.
Ideally, the factors should be able to explain the price/return of the asset. For pairs trading, we will be looking at the price of the asset. One way to determine how well the factors explain the asset price is the goodness of fit of a linear regression of the asset price on the factors. The regression gives an r^2, and a high r^2 indicates that the factors explain the price of the asset well within the given time frame. However, a high r^2 within one time frame does not guarantee that it will persist in the following time frame, as indicated in the figure below.
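The two-window check described above could be sketched roughly as follows. This is only a minimal illustration on synthetic data, not the code behind the figure: the helper names `r_squared` and `r2_two_windows`, the random-walk factors, and the split point are all assumptions made for the example.

```python
import numpy as np

def r_squared(factors, y):
    """In-sample r^2 of an OLS regression of y on the factors (with intercept)."""
    X = np.column_stack([np.ones(len(y)), factors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

def r2_two_windows(factors, y, split):
    """Fit the regression separately before and after `split`; return both r^2 values."""
    early = r_squared(factors[:split], y[:split])
    late = r_squared(factors[split:], y[split:])
    return early, late

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
n, k, split = 500, 3, 250
factors = rng.normal(size=(n, k)).cumsum(axis=0)   # random-walk factor levels
beta = np.array([1.0, -0.5, 0.3])                  # hypothetical exposures
price = factors @ beta + rng.normal(scale=0.5, size=n)
# After the split, the asset follows its own random walk instead of the factors.
price[split:] = price[split - 1] + rng.normal(size=n - split).cumsum()

r2_early, r2_late = r2_two_windows(factors, price, split)
print(f"r^2 early window: {r2_early:.3f}, late window: {r2_late:.3f}")
```

Running the same comparison for every asset and flagging those whose r^2 stays above a threshold in both windows gives a simple persistence screen.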
Similarly, for picking pairs, Vidyamurthy proposed choosing pairs with a high 'distance measure'. This distance measure captures how similar the factor exposures of the two assets are. Again, a high distance measure obtained in one time frame may not persist through the following time frame, as indicated in the figure below.
(Red indicates pairs that have r^2 > 0.9 in the earlier time frame.)
The optimal pairs are likely those with a high distance measure and a high r^2 (from the linear regression of the asset on the factors) in both time frames...
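For reference, here is a rough sketch of how the distance measure could be computed in two windows. It assumes the distance measure is the absolute correlation between the two assets' common-factor components, |x_A' F x_B| / sqrt((x_A' F x_A)(x_B' F x_B)), where x_A and x_B are the factor exposure vectors and F is the factor covariance matrix; that is my reading of the book's definition, so treat it as an assumption. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def exposures(factors, asset):
    """OLS factor exposures (slopes only) of an asset series on the factor series."""
    X = np.column_stack([np.ones(len(asset)), factors])
    coef, *_ = np.linalg.lstsq(X, asset, rcond=None)
    return coef[1:]  # drop the intercept

def distance_measure(x_a, x_b, factor_cov):
    """Absolute correlation of the common-factor components implied by the exposures."""
    num = x_a @ factor_cov @ x_b
    den = np.sqrt((x_a @ factor_cov @ x_a) * (x_b @ factor_cov @ x_b))
    return abs(num / den)

def windowed_distance(factors, a, b, start, stop):
    """Estimate exposures and factor covariance on rows [start, stop) only."""
    f = factors[start:stop]
    cov = np.cov(f, rowvar=False)  # k x k covariance of the factor series
    x_a = exposures(f, a[start:stop])
    x_b = exposures(f, b[start:stop])
    return distance_measure(x_a, x_b, cov)

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(1)
n, k = 500, 3
factors = rng.normal(size=(n, k))
asset_a = factors @ np.array([1.0, 0.5, -0.2]) + rng.normal(scale=0.3, size=n)
asset_b = factors @ np.array([0.9, 0.6, -0.1]) + rng.normal(scale=0.3, size=n)

d_early = windowed_distance(factors, asset_a, asset_b, 0, 250)
d_late = windowed_distance(factors, asset_a, asset_b, 250, 500)
print(f"distance early window: {d_early:.3f}, late window: {d_late:.3f}")
```

A pair would then pass the screen only if both `d_early` and `d_late`, and both assets' r^2 values, stay above the chosen thresholds.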
Any frameworks/ideas/hints for picking a good set of assets/factors that would persist through time?
Factors used: YAHOO/INDEX_HUI, YAHOO/INDEX_VIX, YAHOO/INDEX_OSX, YAHOO/INDEX_XAU, YAHOO/TSX_RTRE_TO, YAHOO/INDEX_WILREIT, YAHOO/INDEX_SML, YAHOO/INDEX_N225, YAHOO/INDEX_W5KLCV, YAHOO/INDEX_HCX
Assets used: components of Dow Jones