VWAP - Are there any plans to fix this?

It has been pointed out to me that the VWAP factor,

from quantopian.pipeline.factors import VWAP

is currently being computed incorrectly. It appears that this factor computes VWAP from daily data, using daily close prices and daily volumes. Every other definition of VWAP I have seen (forgive this unscholarly source) uses tick data; it can also be estimated using minute bars. There was a discussion in another thread on how to estimate VWAP using minute bars. Yes, it is arguable that VWAP can be computed on any time scale, but it is misleading for a daily VWAP to be computed from other daily data.
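To make the complaint concrete, here is a minimal sketch of a VWAP built only from daily bars, assuming (as described above) that the factor weights each day's close by that day's volume over a trailing window. The function name and the sample numbers are made up for illustration; this is not the platform's actual code.

```python
def windowed_daily_vwap(closes, volumes):
    """Volume-weighted average of daily closes over a trailing window.

    Illustrative only: assumes one close and one volume per day.
    """
    total_volume = sum(volumes)
    if total_volume == 0:
        return float("nan")
    return sum(c * v for c, v in zip(closes, volumes)) / total_volume


# Three hypothetical daily bars.
closes = [10.0, 11.0, 12.0]
volumes = [100, 300, 100]

three_day = windowed_daily_vwap(closes, volumes)
# With a one-day window the weights cancel and only the close is left.
one_day = windowed_daily_vwap(closes[-1:], volumes[-1:])
```

With a one-day window this quantity is just the close, so it carries no information about where volume traded during the session, which is what a VWAP from tick or minute data would capture.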

A debate about the right way to compute VWAP is not productive; however, many other members of the Quantopian community and I would find a daily VWAP computed from minute or tick bars very useful.

Are there any plans to generate a VWAP based on minute bars?

5 responses

Hi Peter,

The current implementation of VWAP uses daily data because pipeline is currently integrated with daily data only. In the future, we would like to be able to incorporate minute-level data in pipeline, but until that happens, we won't be able to create built-in factors that use minute data. The best alternative right now is to make your own VWAP computations outside of pipeline using minute data from data.history.
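As a rough sketch of the "compute it yourself from minute data" route, the helper below estimates a session VWAP from parallel lists of per-minute prices and volumes. It is plain Python so it can run anywhere; on the platform the inputs would presumably come from data.history at minute frequency, and that call is deliberately not shown here. The function name and the choice to skip empty bars are assumptions.

```python
import math


def intraday_vwap(prices, volumes):
    """Estimate a session VWAP from per-minute prices and volumes.

    Minutes with no trades (non-positive or NaN volume, or NaN price)
    are skipped rather than treated as zero-priced trades.
    """
    notional = 0.0
    traded = 0.0
    for price, volume in zip(prices, volumes):
        if not (volume > 0) or math.isnan(price):
            continue
        notional += price * volume
        traded += volume
    return notional / traded if traded > 0 else float("nan")


# Four hypothetical minute bars; the third minute had no trades.
minute_prices = [10.0, 10.5, float("nan"), 11.0]
minute_volumes = [200, 100, 0, 100]
vwap = intraday_vwap(minute_prices, minute_volumes)
```

This is still an estimate: each minute is represented by a single price, so trades inside the minute are not weighted individually as they would be with tick data.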

Sorry for the inconvenience.

Hi Jamie,
Thanks for the quick reply. I understand about the limitations of Pipeline.
What would be wrong with computing VWAP once and then attaching it as an external data source, along the lines of the Morningstar data, perhaps like:

from quantopian.pipeline.data import data_from_daily_bars

Does that sound silly?

Hi Jamie,

Regarding "The current implementation of VWAP uses daily data because pipeline is currently integrated with daily data only": if I understand the discussion around VWAP correctly, the implementation would still be daily, since for each stock a daily VWAP value would be computed using your minute bar database. So you would not be incorporating minute-level data into pipeline directly, but rather daily data derived from minute bars (which in effect is what you do to create the daily bar database in the first place). Are you misunderstanding the point here, since you say "we won't be able to create built-in factors that use minute data"?

Computing daily VWAP from minute bars seems tractable, since you'd set up the database from 2002 to present, and then every night/weekend, update it. Of course, as errors in the minute bar data are found (e.g. missing splits), you'd have to make corrections to the database, but the corrections would be on individual stocks, not the entire database. So the mechanics of setting up and maintaining the database don't seem so bad, but maybe I'm missing something?
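The mechanics described above could look something like the sketch below: a nightly job appends one day of values for every symbol, and a correction recomputes only the affected symbol. The class, its method names, and the in-memory dict are all hypothetical; nothing here reflects how Quantopian actually stores its data.

```python
from collections import defaultdict


def session_vwap(bars):
    """bars: list of (price, volume) minute bars for one stock and one day."""
    traded = sum(v for _, v in bars)
    if traded <= 0:
        return float("nan")
    return sum(p * v for p, v in bars) / traded


class DailyVwapStore:
    """Hypothetical store laid out as {symbol: {date: vwap}}."""

    def __init__(self):
        self._values = defaultdict(dict)

    def update(self, date, minute_bars_by_symbol):
        """Nightly job: add one new day for every symbol."""
        for symbol, bars in minute_bars_by_symbol.items():
            self._values[symbol][date] = session_vwap(bars)

    def correct(self, symbol, corrected_bars_by_date):
        """Recompute one symbol after its minute bars have been fixed."""
        for date, bars in corrected_bars_by_date.items():
            self._values[symbol][date] = session_vwap(bars)

    def get(self, symbol, date):
        return self._values[symbol][date]


store = DailyVwapStore()
store.update("2016-01-04", {
    "AAA": [(10.0, 100), (12.0, 100)],
    "BBB": [(50.0, 10)],
})
# Suppose AAA's minute bars are later restated (e.g. a missed 2:1 split).
store.correct("AAA", {"2016-01-04": [(5.0, 200), (6.0, 200)]})
```

Only AAA's history is touched by the correction; BBB's stored value is left alone, which is the property the paragraph above relies on.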

Intuitively, it does seem like there would be a great advantage, in the intended workflow, to using daily prices that are smoothed in some fashion (e.g. VWAP) rather than values from individual trades (e.g. the last trade of the day). Although apparently there are aspirations "to incorporate minute-level data in pipeline", there was a discussion when pipeline was launched suggesting that it would be computationally impossible on the present platform (see Scott Sanderson's comments). Do you actually have ideas on how to do it? If so, when?

As I understand it, Quantopian already "computes" all the fields of each daily bar, i.e., OHLCV, by looping over historical 1-minute bars. This is an excellent design choice which results in these OHLCV daily price fields being more realistic relative to actual trading possibilities than traditional daily-only data sources are.

The best way to add robust support for VWAP would be to add it as a native field to each daily bar in the database. Given that Quantopian already constructs its daily bars from intraday data, this would add virtually zero overhead to that calculation process, and would make VWAP easily accessible in a daily pipeline.
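To illustrate the "virtually zero overhead" point, here is a sketch of building one daily bar from minute bars in a single pass, with VWAP computed alongside OHLCV. The dict layout, the function name, and the use of each minute's close as its representative price are assumptions made for this example.

```python
def build_daily_bar(minute_bars):
    """Aggregate a non-empty list of minute bars into one daily bar.

    Each minute bar is a dict with open, high, low, close and volume.
    """
    first = minute_bars[0]
    bar = {
        "open": first["open"],
        "high": first["high"],
        "low": first["low"],
        "close": first["close"],
        "volume": 0,
    }
    notional = 0.0
    for m in minute_bars:
        bar["high"] = max(bar["high"], m["high"])
        bar["low"] = min(bar["low"], m["low"])
        bar["close"] = m["close"]
        bar["volume"] += m["volume"]
        # The only extra work needed for VWAP: one multiply-add per bar.
        notional += m["close"] * m["volume"]
    if bar["volume"] > 0:
        bar["vwap"] = notional / bar["volume"]
    else:
        bar["vwap"] = float("nan")
    return bar


minutes = [
    {"open": 10.0, "high": 10.5, "low": 9.5, "close": 10.0, "volume": 100},
    {"open": 10.0, "high": 11.5, "low": 10.0, "close": 11.0, "volume": 300},
]
daily = build_daily_bar(minutes)
```

The loop that already produces open, high, low, close and volume needs one additional running sum, so the extra field costs very little per bar.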

Having a native VWAP price field would also give Quantopian another unique advantage vs. competing services. Please consider taking this approach.

VWAP would seem to be a good starting point (and would seem to be trivial if, indeed, the daily OHLCV bars are computed from the minute bars). Any daily statistic computed solely on a stock-by-stock basis would seem to be straightforward. The best approach would be to put users in the driver's seat, with an API that would allow computation of custom daily statistics derived from minute bars (which could include incorporating minute bar data from a trailing window of N days).
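One possible shape for such an API, purely as a sketch: the user supplies a function over minute bars, and a driver applies it per stock over a trailing window of N days to produce one value per day. The names daily_statistic and window_days, and the (price, volume) tuple layout, are invented for this example and do not correspond to any existing platform function.

```python
def daily_statistic(minute_bars_by_day, func, window_days=1):
    """Apply func to the minute bars pooled over a trailing window of days.

    minute_bars_by_day: chronological list of (date, [(price, volume), ...])
    for a single stock. Returns {date: value} for every day that has a
    full window behind it.
    """
    results = {}
    for i in range(window_days - 1, len(minute_bars_by_day)):
        window = minute_bars_by_day[i - window_days + 1 : i + 1]
        pooled = [bar for _, bars in window for bar in bars]
        results[minute_bars_by_day[i][0]] = func(pooled)
    return results


def vwap(bars):
    """User-defined statistic: volume-weighted average price."""
    traded = sum(v for _, v in bars)
    if traded <= 0:
        return float("nan")
    return sum(p * v for p, v in bars) / traded


def price_range(bars):
    """Another user-defined statistic: high minus low over the window."""
    prices = [p for p, _ in bars]
    return max(prices) - min(prices)


days = [
    ("day1", [(10.0, 100), (12.0, 100)]),
    ("day2", [(14.0, 200)]),
]
one_day_vwap = daily_statistic(days, vwap)
two_day_vwap = daily_statistic(days, vwap, window_days=2)
two_day_range = daily_statistic(days, price_range, window_days=2)
```

Because each result depends only on one stock's own minute bars, the computation could be run once and stored as ordinary daily data.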