Creating a dataframe indexed by date

Hi there. I'm trying to create a dataframe for a security that has a series of datetimes as its index, then in each subsequent column various price data for the security at each given datetime - e.g. close, volume, etc.

I'm new to Quantopian and poked around data.history a little bit, but am still unsure how to deal with passing fixed datetimes and datetime ranges in an algorithm. Ideally I'd be able to do something like the following:

for tick in tick_list:
    df = pandas_datareader.data.DataReader(tick, 'yahoo', start, end)

If anybody knows a way to do this in Quantopian, please let me know!

3 responses

Need a little bit of clarification here... Are you looking to import your own data or use the built-in Q data?

If you use the built-in Q data, then the 'history' method is what you would use (see https://www.quantopian.com/help#api-data-history ). The rows (the major index) will be times (either daily or minutely, depending on the chosen frequency) and the columns will be the price and/or volume data at each time, depending on which fields one chooses. The result will be a 2D pandas dataframe if a single security is fetched, or a 3D pandas panel if multiple securities are fetched (or potentially a series if a single security and a single field are fetched).

The timestamps in the index are all absolute, so one can manipulate, search, and slice by fixed datetimes as desired. However, the length of the lookback (the number of rows, or time units) is fixed and relative to the current algorithm time. One can't directly specify 'fetch data for 1-2-2016', for instance. This is entirely to prevent look-ahead bias: an algorithm can never view data from the future.
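
To illustrate slicing by fixed dates, here is a rough sketch in plain pandas. The frame is a made-up stand-in for what 'history' would return for one security (the dates and values are invented, and time zones are left out for brevity), so treat it as an assumption about the shape rather than real output.

```python
import pandas as pd

# Stand-in for a single-security history frame: a DatetimeIndex of weekdays
# with one column per field. All values here are made up for illustration.
index = pd.bdate_range("2016-01-04", periods=10)
prices = pd.DataFrame(
    {
        "close": [float(100 + i) for i in range(10)],
        "volume": [1000 * (i + 1) for i in range(10)],
    },
    index=index,
)

# Label-based slicing on a DatetimeIndex includes both endpoints.
first_week = prices.loc["2016-01-04":"2016-01-08"]

# A single fixed date can be looked up with a Timestamp.
one_day = prices.loc[pd.Timestamp("2016-01-06")]
```

Here 'first_week' holds the five rows from Monday through Friday, and 'one_day' is the row for January 6.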

One can indirectly select a specific date range by calculating the number of days (or minutes) between the desired date and the current algo time. Then, simply use that 'delta' as the 'bar_count' in the history method. I'm not sure what the use case for this would be, though.
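
A rough sketch of that 'delta' calculation, assuming daily bars. The helper name 'bars_since' is hypothetical, and it counts weekdays only, so market holidays are ignored; a real trading calendar would be needed for an exact count.

```python
import pandas as pd


def bars_since(target_date, current_date):
    """Count weekdays from target_date through current_date, inclusive.

    Assumption: every weekday is a trading day (holidays are ignored).
    """
    days = pd.bdate_range(start=target_date, end=current_date)
    return len(days)


# Hypothetical dates; inside an algorithm the current date would come from
# the algorithm clock rather than being hard-coded.
bar_count = bars_since("2016-01-04", "2016-01-15")
```

The result would then be passed as the 'bar_count' argument to the history method. Whether the current day counts as a bar depends on when the call is made, so the number may need adjusting by one.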

Does this help? I could give some examples if you would like specifics.

Using the built in data. Preventing look ahead bias makes sense. In the IDE I developed it in originally, my algo used calculations based on data from a specified date range prior to trading, largely for reasons of computational efficiency - performing the calculations once, then storing that data and using it to compare stats for different stocks was much faster than recomputing for each day of trading in the backtest. Computation time doesn't seem like it'll be a problem in Quantopian though, and recomputing for each day, adding in the most recent available data is the ultimate goal of the algo.

Thanks for your help!

I'd start with re-running the calculations every day in the 'before_trading_start' method, if that works for your situation. Don't worry about time initially. That method gets allotted 5 minutes, which should generally (though not always) be enough to do what one wants. Maybe then try to move it to a scheduled function. Those get allotted only a minute, but if it works there then it's easy to schedule it to run only once a week or month or something.
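
Something along these lines, as a minimal sketch. The attributes 'context.security', 'context.lookback', and 'context.mean_close' are hypothetical names assumed to be set up in 'initialize', and the mean is just a placeholder for whatever statistic you actually want.

```python
def before_trading_start(context, data):
    # Re-fetch a fresh trailing window each morning instead of reusing
    # a frame saved on a previous day.
    # context.security and context.lookback are hypothetical attributes,
    # assumed to have been set in initialize().
    history = data.history(
        context.security, ["close", "volume"], context.lookback, "1d"
    )

    # Placeholder calculation: average close over the window.
    context.mean_close = history["close"].mean()
```

Only the result of the calculation is kept on 'context'; the price data itself is fetched again the next day.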

One word of caution... You mentioned "adding in the most recent available data". This may have just been the wording, but don't ever save price or volume data from day to day. Any stock splits or dividends will throw off the values a lot. These events happen more often than one would think.
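
To make the split problem concrete, here is a small sketch with invented numbers. It assumes a hypothetical 2-for-1 split taking effect on the fourth day, and that a fresh fetch after the split returns earlier prices already adjusted for it.

```python
import pandas as pd

# Made-up closes around a hypothetical 2-for-1 split on the fourth day.
raw_close = pd.Series(
    [100.0, 102.0, 104.0, 52.5, 53.0],
    index=pd.bdate_range("2016-01-04", periods=5),
)

# Values saved before the split are still in pre-split terms.
saved_before_split = raw_close.iloc[:3]

# Assumed behavior of a fresh fetch after the split: the earlier prices
# come back scaled by the split ratio.
split_ratio = 0.5
fresh_after_split = raw_close.copy()
fresh_after_split.iloc[:3] *= split_ratio

# Mixing saved values with new ones shows a false one-day drop of about 50%.
stale_return = raw_close.iloc[3] / saved_before_split.iloc[-1] - 1

# Using only freshly fetched, adjusted values gives the real move (about +1%).
adjusted_return = fresh_after_split.iloc[3] / fresh_after_split.iloc[2] - 1
```

The stale comparison makes an ordinary day look like a crash, which is why re-fetching the window each day is the safer habit.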

Good luck.