Python Lib for Research/Experimenting outside of Quantopian

Hi all,
I made some Quandl and Yahoo data tools that I've found pretty useful, so I cleaned up the code and threw it in a GitHub Repo.

The Portfolio class is great for experimenting with DataFrame operations to use with the history() function in Quantopian. The DataManager works well for getting company fundamentals from Quandl into CSV files. It's off to a good start. Check it out, and if you like it, help make it better.

9 responses

Thanks - great work, David.

Hi David, it looks interesting, but I'm wondering how it could be used with Quantopian? Maybe it cannot? I'm trying to figure out how to best do research offline, so this will be useful anyway :)

Perhaps this can be added to Zipline / Quantopian approved list.

Perhaps the name Portfolio is inappropriate, as it is not a portfolio in the investment sense. It could be named DataPortfolio, DataSource, or something else.

I have been asking Quantopian for proper DataProviders around different data sources, rather than just the fetcher API. This looks awesome.

Jason,
As of right now, most of it cannot be used directly with Quantopian. However, it could pretty easily be altered to return the URL where it gets the data and pass that to the fetcher. I would say its usefulness is in being able to easily experiment with various DataFrame operations without having to run a backtest and log the info.

Suminda,
I would have to look at how Zipline wants its data formatted. I do know they have things for DataFrame sources, which is essentially what the library relies on. Zipline already supports the Yahoo data, so I am not positive it would be adding anything unique. Names are the simplest part of a program, but for some reason they can be tough to decide on. I went with Portfolio just because it can hold multiple securities. I planned to add quantities too, but I might make it a DataSource and make the portfolio separate.

I like the sector/industry data portion; that data could be useful for normalizing by sector. I don't think it is available via Quandl yet. I sent them a pointer to the data source and they said they will work on it. I didn't figure out the URL pattern there either. My guess is that the URLs use their database PKs, with no real pattern.

As far as making the library more useful to the Quantopian community, I can go through and separate out all of the functions that generate the various URLs. Maybe give each class a url attribute to pass to the fetcher, and pre_func methods that format the data. I think I like that idea.
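To make the idea concrete, here is a minimal sketch of what such a class might look like. It is not code from the repo: the class name DatasetSource, the base URL, and the query parameter names are all made up for illustration, and pre_func simply assumes it is handed a pandas DataFrame.

```python
from urllib.parse import urlencode

# Hypothetical base URL, used only for illustration; swap in the real endpoint.
BASE_URL = "https://example.com/api/datasets"


class DatasetSource:
    """Hypothetical data source exposing a url and a pre_func for a CSV fetcher."""

    def __init__(self, code, start=None, end=None, rename=None):
        self.code = code            # dataset code, e.g. "WIKI/AAPL"
        self.start = start          # optional start date string
        self.end = end              # optional end date string
        self.rename = rename or {}  # column renames applied by pre_func

    @property
    def url(self):
        # Only include the query parameters that were actually provided.
        params = {k: v for k, v in (("start", self.start), ("end", self.end)) if v}
        query = "?" + urlencode(params) if params else ""
        return f"{BASE_URL}/{self.code}.csv{query}"

    def pre_func(self, df):
        # Assumes df is a pandas DataFrame; returns a copy with renamed columns.
        return df.rename(columns=self.rename)


source = DatasetSource("WIKI/AAPL", start="2014-01-01", rename={"Adj. Close": "price"})
print(source.url)
```

Inside an algorithm this would presumably be wired up as fetch_csv(source.url, pre_func=source.pre_func), so the URL generation and the formatting step live in one place.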

I will look at altering the actual Quandl API so that it formats dataset URLs and can be used with the fetcher. I think easy access to all Quandl data would be a great addition to the Quantopian toolbox. Glad you guys are finding it useful.

I altered the Quandl api to fetch any dataset into Quantopian, see it in this post if you haven't yet.

One possible thing you can do is implement this like F# type providers (I think this is doable in Python too, and you can dynamically add fields to pandas objects). Also pass in a publish date, since financial year-end data is published later than the year end itself, so that you don't mistakenly introduce a look-ahead bias.
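The publish-date point can be sketched roughly as follows. This is only an illustration with made-up records and a hypothetical helper name (latest_known); the idea is to look values up by when they were published, not by the period they describe.

```python
from bisect import bisect_right
from datetime import date

# Made-up records for illustration: (period_end, publish_date, value)
reports = [
    (date(2013, 12, 31), date(2014, 2, 15), 1.10),
    (date(2014, 3, 31), date(2014, 5, 10), 1.25),
]


def latest_known(reports, as_of):
    """Return the most recently published record on or before as_of, or None."""
    ordered = sorted(reports, key=lambda r: r[1])  # sort by publish date
    publish_dates = [r[1] for r in ordered]
    i = bisect_right(publish_dates, as_of)         # count of records published by as_of
    return ordered[i - 1] if i else None


# On 2014-04-15 the Q1 period has ended, but its report is not out yet,
# so the year-end figure is still the latest known value.
print(latest_known(reports, date(2014, 4, 15)))
```

Keying on period_end instead would hand back the Q1 value on 2014-04-15, almost a month before it was actually available.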

Also perhaps get this whitelisted as a Quantopian import.

Look-ahead bias could be a problem. For example, if an earnings report is released after hours on some day, the date will be the same as if it was released before the market opened. That is an important piece of the puzzle. I have looked around a little bit for an easily accessible data source with the dates and times of earnings reports, etc., but have not had much luck. I'd like to add that if anybody knows of a free source for the info.
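If a source with release timestamps does turn up, the after-hours case could be handled along these lines. This is a sketch under assumptions: a 16:00 close in exchange-local time, weekends skipped, market holidays ignored, and a hypothetical function name (first_tradable_date).

```python
from datetime import datetime, time, timedelta

MARKET_CLOSE = time(16, 0)  # assumed regular close, exchange-local time


def first_tradable_date(released_at):
    """Return the first date on which a release could be acted on."""
    day = released_at.date()
    # A release at or after the close cannot be traded on until the next session.
    if released_at.time() >= MARKET_CLOSE:
        day += timedelta(days=1)
    # Skip Saturday and Sunday; market holidays are not handled here.
    while day.weekday() >= 5:
        day += timedelta(days=1)
    return day


# Released Tuesday 2014-07-22 at 16:30, after the assumed close.
print(first_tradable_date(datetime(2014, 7, 22, 16, 30)))
```

Stamping the data with this date instead of the raw release date keeps a backtest from trading on a report before it was public.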