Upcoming changes to Quandl datasets in Pipeline (VIX, VXV, etc.)

edited Jul 5, 2016

On Thursday, July 7th, we'll be making some changes to how macroeconomic datasets are treated in pipeline.

Currently, a pipeline that computes a macroeconomic indicator such as VIX outputs the same value for every asset (sid) on each day. In other words, we unnecessarily assign a daily value to each asset instead of representing a dataset such as VIX as its own time series.

With these changes, data not associated with an asset (VIX, macroeconomic indicators, etc.) will be loaded as a single column of values. When something like vix.close is used as an input to a custom factor, it is passed as a single-column array (dates x 1) rather than as a 2D array of values (dates x assets).

This is best illustrated with an example:

class VIXFactor(CustomFactor):  
    window_length = 3  
    inputs = [vix.close]

    def compute(self, today, assets, out, vix):  
        # Old vix: Each row contains `len(assets)` number of repeated values of  
        # vix for that day. There are 3 rows because our window length is 3.  
        # [[21, 21, 21, ..., 21, 21, 21],  
        #  [20, 20, 20, ..., 20, 20, 20],  
        #  [22, 22, 22, ..., 22, 22, 22]]

        # New vix: There are still 3 rows, but now there is always only 1  
        # column, which is independent of the number of assets.  
        # [[21],  
        #  [20],  
        #  [22]]

        # This will still work in the new case, as the singleton value [22]  
        # will simply be broadcast into `out`.  
        out[:] = vix[-1]

This new format is not only a more faithful representation of the data; it also lends itself to computing correlations and regressions between a single column of data and the columns of other factors (examples to follow soon).
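
As a rough illustration of the kind of computation the single-column layout makes easy, here is a minimal NumPy sketch that runs outside of Pipeline. It is not a Quantopian built-in: the helper name `column_correlations` and the sample numbers are assumptions made up for this example.

```python
import numpy as np

def column_correlations(macro, factor):
    """Pearson correlation of one macro column with each asset column.

    macro:  shape (window_length, 1), like the new-style vix.close window
    factor: shape (window_length, n_assets), e.g. a window of daily returns
    Returns an array of shape (n_assets,).
    """
    macro = np.asarray(macro, dtype=float)
    factor = np.asarray(factor, dtype=float)
    # Demean along the date axis.
    m = macro - macro.mean(axis=0)
    f = factor - factor.mean(axis=0)
    # (window, 1) broadcasts against (window, n_assets) in the product.
    cov = (m * f).sum(axis=0)
    denom = np.sqrt((m ** 2).sum(axis=0) * (f ** 2).sum(axis=0))
    return cov / denom

# Hypothetical sample data: 3 days of VIX and 3 days of values for 2 assets.
vix_window = np.array([[21.0], [20.0], [22.0]])
factor_window = np.array([
    [1.0, -1.0],
    [0.0, 0.0],
    [2.0, -2.0],
])
corr = column_correlations(vix_window, factor_window)
```

Because the macro input has exactly one column, no column has to be picked out before comparing it with every asset. Note that this sketch does not guard against constant columns, which would make the denominator zero.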

So what does this mean for adding macroeconomic terms as pipeline columns?

Unfortunately, it means you can no longer add them directly to a pipeline with code such as pipe.add(vix.close.latest, 'vix'). To replicate that behavior, create a CustomFactor that broadcasts the value to every asset:

class VIXFactor(CustomFactor):  
    window_length = 1  
    inputs = [vix.close]

    def compute(self, today, assets, out, vix):  
        # Here vix might look like [[20]], but this will broadcast the same  
        # value into `out` for every asset.  
        out[:] = vix

pipe.add(VIXFactor(), 'vix')

This also means that any custom factor using VIX will need to reflect the new structure. If you're unsure how to do that, please post your questions in this thread and I will be happy to help with the migration.
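
One pattern that might ease the migration, sketched here with plain NumPy rather than inside Pipeline: take column 0 of the input window. That column exists in both layouts, so the same line yields the daily series either way. The helper name `macro_series` is hypothetical, and the arrays simply mirror the example values shown earlier in this post.

```python
import numpy as np

def macro_series(window):
    """Return the macro values as a 1-D array, one entry per day."""
    # Column 0 is the only column in the new layout and one of many
    # identical columns in the old layout, so it works for both.
    return np.asarray(window)[:, 0]

# Old layout: the daily value repeated once per asset (3 assets here).
old_vix = np.array([[21, 21, 21], [20, 20, 20], [22, 22, 22]], dtype=float)
# New layout: a single column, independent of the number of assets.
new_vix = np.array([[21], [20], [22]], dtype=float)

# Stand-in for the `out` array that compute() receives (one slot per asset).
out = np.empty(3)
# The latest value is a scalar, which broadcasts to every asset.
out[:] = macro_series(new_vix)[-1]
```

Inside a real compute() the same idea would read `vix[:, 0]` for the series and `vix[-1, 0]` for the latest value.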

For a list of all datasets affected by this change, please see the available datasets from Quandl.

We welcome any feedback you might have on this new update and hope it will simplify working with macroeconomic datasets.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

6 responses

Simon Thornington

Jul 1, 2016

So, what is the transition plan for the backwards-compatibility-breaking change?

Timothy Bhattacharyya

Jul 1, 2016

@Seong

Will this update solve the problem that the current day's VIX and VXV are not available in Quantopian? Right now we can only access yesterday's closing VIX using Quandl.

Would love to be able to access VIX and VXV in real time.

Seong Lee

Jul 1, 2016

@Simon we'll be working individually with the authors of live trading algorithms affected by this change. If you are unsure about compatibility, please reach out to me at slee @ quantopian.com

@Timothy this will not change data availability, but you make a good point. Thank you for bringing it up.

Jeff Liu

Jul 4, 2016

I echo what Timothy has said. It is important and critical to be able to load intraday VIX.

Seong Lee

Jul 5, 2016

The following post has been updated with the new method of loading VIX.

Please view: https://www.quantopian.com/posts/javols-just-another-volatility-strategy-dot-dot-dot

Also, if you are currently loading in VIX through a CustomFactor like so:

# Calculate the impact of the term structure (see more about the 'impact' at the link above). Basically,  
# isolate how much of the movement of VXX isn't due to the VIX index. As a handwavy model we'll  
# run a simple regression of the changes in the VIX to the changes in VXX, and use the  
# intercept term as an estimation of the 'impact.'  
class TermStructureImpact(CustomFactor):  
    # Pre-declare inputs and window_length  
    inputs = [yahoo_index_vix.close, USEquityPricing.close]  
    window_length = 20  
    def compute(self, today, assets, out, vix, close):  
        # Get the prices series of just VXX and calculate its daily returns.  
        vxx_returns = pd.DataFrame(close, columns=assets)[sid(38054)].pct_change()[1:]  
        # Since there isn't a ticker for the raw VIX Pipeline feeds us the value of the  
        # VIX for each day in the 'window_length' for each asset. Which kind of makes sense  
        # -- the VIX is the same value for every security.  
        # Since I have a fixed universe I'll just use VXX, one of my securities, to get a single series of  
        # VIX data. You could use any security or integer index to any column, but I'll use one of my  
        # securities just to keep things straight in my head.  
        vix_returns = pd.DataFrame(vix, columns=assets)[sid(38054)].pct_change()[1:]  
        # Calculate the 'impact.'  
        alpha = _intercept(vix_returns, vxx_returns)  
        out[:] = alpha * np.ones(len(assets))

The correct way of doing so will be:

# Calculate the impact of the term structure (see more about the 'impact' at the link above). Basically,  
# isolate how much of the movement of VXX isn't due to the VIX index. As a handwavy model we'll  
# run a simple regression of the changes in the VIX to the changes in VXX, and use the  
# intercept term as an estimation of the 'impact.'  
class TermStructureImpact(CustomFactor):  
    # Pre-declare inputs and window_length  
    inputs = [yahoo_index_vix.close, USEquityPricing.close]  
    window_length = 20  
    def compute(self, today, assets, out, vix, close):  
        # Get the prices series of just VXX and calculate its daily returns.  
        vxx_returns = pd.DataFrame(close, columns=assets)[sid(38054)].pct_change()[1:]  
        # After the change, Pipeline feeds us the VIX as a single column: one value  
        # for each day in the 'window_length', independent of the number of assets.  
        # There is no longer any need to select an arbitrary security's column;  
        # column 0 of the DataFrame below is the VIX series.  
        vix_returns = pd.DataFrame(vix).pct_change()[0].iloc[1:]  
        # Calculate the 'impact.'  
        alpha = _intercept(vix_returns, vxx_returns)  
        out[:] = alpha

We currently plan to roll this change out on Thursday, July 7th (EST). Please email me at slee @ quantopian.com if you need help migrating your algorithms.
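
Note that `_intercept` is not defined anywhere in this thread. For readers who want to try the snippet, a hypothetical stand-in could be an ordinary least-squares fit that returns the intercept term. The name `ols_intercept` and the sample data below are assumptions for illustration only, not the helper from the original algorithm.

```python
import numpy as np

def ols_intercept(x, y):
    """Intercept of the least-squares line y = slope * x + intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.polyfit returns coefficients from highest degree to lowest,
    # so a degree-1 fit gives (slope, intercept).
    slope, intercept = np.polyfit(x, y, 1)
    return intercept

# Hypothetical data lying exactly on y = 2x + 0.5.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 0.5
alpha = ols_intercept(x, y)
```

Since the result is a single scalar, `out[:] = alpha` fills every asset's slot without needing `np.ones(len(assets))`.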

Burrito Dan

Jul 11, 2016

Something that threw me is that while the VIX (or other macroeconomic variable) appears in compute() as a single column, it is still broadcast to multiple columns in the output of pipeline, one per asset. This means you still need to select an arbitrary column in your trade code.

output = pipeline_output('example')  
vix = output["vix"].iloc[0]

In the end, I didn't need to change my code.
