How to get daily volumes

Hi, pretty new to this so apologies if I'm making basic mistakes...

I'm trying to get daily trade volumes. I initially tried a pipeline with this column:

USEquityPricing.volume.latest

... but the results seemed too small e.g.

  • Apple 06/15/2018 = 15.7M
  • Yahoo finance reported 61.2M

Is the time period smaller than a day? I can't tell from the docs.

I then tried this:

from quantopian.research import volumes, symbols

aapl_close = volumes(
    assets=symbols('AAPL'),
    start='2018-06-10',
    end='2018-06-15',
)

import pandas as pd
pd.DataFrame({
    'AAPL': aapl_close
}).tail()

The docs say the frequency should be daily by default, but this gives 28.4M.

Confused!

6 responses

My quick research shows a substantial difference in Quantopian's volume data, not only compared with Yahoo's history but also between the two different ways of retrieving it.

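def initialize(context):
    # Run record_volume every trading day, scheduled with the market-close rule.
    schedule_function(record_volume, date_rules.every_day(), time_rules.market_close())

def record_volume(context, data):
    stock = symbol('AAPL')
    # Value returned by data.current for the 'volume' field.
    v = data.current(stock, 'volume')
    # Last value of a 5-bar daily-frequency ('1d') volume history.
    V1 = data.history(stock, 'volume', 5, '1d').iloc[-1]
    print (V1, v)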

Log output

2018-06-01 12:59 PRINT (18278742.0, 320402L)
2018-06-04 12:59 PRINT (21159162.0, 93367L)
2018-06-05 12:59 PRINT (16661532.0, 206261L)
2018-06-06 12:59 PRINT (14113150.0, 115111L)
2018-06-07 12:59 PRINT (14982271.0, 118309L)
2018-06-08 12:59 PRINT (19403342.0, 170845L)
2018-06-11 12:59 PRINT (14360539.0, 94633L)
2018-06-12 12:59 PRINT (12564678.0, 145482L)
2018-06-13 12:59 PRINT (14751124.0, 209466L)
2018-06-14 12:59 PRINT (15145759.0, 212633L)
2018-06-15 12:59 PRINT (27315535.0, 263323L)

Yahoo Quote

https://finance.yahoo.com/quote/AAPL/?p=AAPL

At close: June 15 4:00PM EDT Volume 33,104,035

Yahoo history

https://finance.yahoo.com/quote/AAPL/history?p=AAPL

Date Volume
2018-06-01 23,250,400
2018-06-04 26,132,000
2018-06-05 21,566,000
2018-06-06 20,933,600
2018-06-07 21,347,200
2018-06-08 26,656,800
2018-06-11 18,308,500
2018-06-12 16,911,100
2018-06-13 21,638,400
2018-06-14 21,610,100
2018-06-15 61,289,600

WHY?

Thanks Vladimir, glad it's not just me.

I noticed that your number: 2018-06-15 12:59 PRINT (27315535.0, 263323L) is nearly the same as the 28.4M that my method returned - a small discrepancy like this is not so bad (maybe due to after hours trading?)

Does anyone know what's happening? My strategy relies on accurate trade volumes so I'm kind of stuck...

Shameless thread bump. Is anyone from Quantopian looking at this?

Hi Nick,

Sorry for the slow reply.

Regarding your original question, the mismatch between Pipeline and quantopian.research.volumes outputs for volume is explained by the different interpretations of the date index of each output.

For quantopian.research.volumes, the date index of the output is the as_of_date of the volume reported, or the date that the value was 'seen'. For example, volumes reports that on 06/15/2018 there was a total volume of 28476218.0 for AAPL by the end of the trading day (see the attached notebook).

On the other hand, the date index of a Pipeline output corresponds to the date on which the pipeline is computed (for a range of dates, the pipeline is computed each day). Pipelines are designed to execute before the start of a trading day, so the latest known value for any column you use in a pipeline will be the value reported at the end of the previous trading day. For example, on 06/15/2018 USEquityPricing.volume.latest will give you the volume corresponding to 06/14/2018 (see the attached notebook).
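To make the one-day offset concrete, here is a minimal sketch in plain Python (no Quantopian imports) of re-keying pipeline-style values to the day the volume was actually traded. The function name realign_to_trade_date and the numbers are hypothetical placeholders, not real AAPL volumes, and the sketch assumes the keys are consecutive trading days.

```python
from datetime import date

# Hypothetical pipeline-style output, keyed by the date the pipeline was computed.
# The values are illustrative placeholders, not real volumes.
pipeline_volume = {
    date(2018, 6, 13): 100,
    date(2018, 6, 14): 200,
    date(2018, 6, 15): 300,
}

def realign_to_trade_date(by_compute_date):
    """Re-key each value to the previous date in the series.

    Assumes the keys are consecutive trading days, so the value seen on
    day N is the volume traded on day N-1. The value for the first day is
    dropped because its trade date is not in the series.
    """
    days = sorted(by_compute_date)
    return {prev: by_compute_date[cur] for prev, cur in zip(days, days[1:])}

by_trade_date = realign_to_trade_date(pipeline_volume)
for day in sorted(by_trade_date):
    print(day, by_trade_date[day])
```

After this shift, the value the pipeline showed for 06/15 sits under 06/14, which is the date quantopian.research.volumes would use for the same figure.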

Vladimir, since you are querying day frequency volume ('1d') with data.history, the last value in the output (regardless of the number of bars requested) corresponds to the partial daily volume of the current day (from the start of the trading day up to the current minute bar). On the other hand, the output of data.current gives you the volume of the current minute bar only. This is why you see a much larger volume out of data.history for the last value.
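The relationship between the two numbers can be sketched with a toy list of minute bars. This is only an illustration of the description above, not Quantopian's implementation: the function names and the bar volumes are made up.

```python
# Hypothetical minute-bar volumes for the session so far (illustrative numbers).
minute_volumes = [500, 300, 200, 400]

def current_bar_volume(bars):
    """Volume of the latest minute bar only (analogous to data.current)."""
    return bars[-1]

def partial_daily_volume(bars):
    """Volume accumulated from the open up to the latest bar
    (analogous to the last value of a '1d' data.history call)."""
    return sum(bars)

print(partial_daily_volume(minute_volumes), current_bar_volume(minute_volumes))
```

The first number grows through the day as bars accumulate, while the second only ever reflects one minute of trading, so the gap between them is widest near the close.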

Regarding the difference between Quantopian's volume data and other sources, it largely depends on how those other sources collect their data, and what time intervals are covered. Quantopian's pricing and volume data is sourced from electronic trades, and only covers regular trading hours (9:30am - 4:00pm ET). Sources like Google and Yahoo often include after-hours trades in their reported volume, which would account for some of the difference you are seeing.

I hope this helps.

Sorry, I forgot to attach the notebook.

Hi Ernesto, thank you for the reply, that was helpful in understanding the code.

I tried a few more example stocks (IBM, XRX, MSFT) and the volumes generally seem a lot lower than other sources (usually around half) which is curious...