@Karol,
Thanks.
"algo only has price information from the date range you specify in get_pricing()"
I gather that this means the entire data set is held in RAM? Is this the same as in the custom IDE/GUI backtester, or does that read data from a database on disk on an as-needed basis? Getting everything into RAM sounds like the way to go, if it'll fit. But RAM costs money, so I'm figuring there are some practical limitations for Quantopian. Looking over http://aws.amazon.com/ec2/instance-types/ (which I think Quantopian uses...can anyone confirm?), 244 GB of RAM is the largest instance size listed.
I've noted the potential limitation for the research platform on https://www.quantopian.com/posts/quantopian-research-platform-comments-and-questions. If the entire data set needs to be loaded into RAM, but RAM is limited, then at some point there will be a problem. And since users can't see available RAM and usage, it could be a very unpleasant user experience when scaling up. Has anyone tried loading minute bar data for 5,000 securities over 10 years on the research platform, for example?
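For a rough sense of scale, here's a back-of-envelope sketch in Python. The numbers are my own assumptions, not anything Quantopian has published: 252 trading days per year, 390 minute bars per regular session, 5 fields per bar (open, high, low, close, volume), each stored as an 8-byte float in a dense array, with no compression and no index or pandas overhead. The function name estimate_bytes is made up for illustration.

```python
def estimate_bytes(n_securities, n_years,
                   days_per_year=252,      # assumed trading days per year
                   minutes_per_day=390,    # assumed minute bars per regular session
                   n_fields=5,             # assumed: open, high, low, close, volume
                   bytes_per_value=8):     # assumed: float64 storage
    """Estimate raw storage for a dense minute-bar array, ignoring all overhead."""
    n_bars = n_securities * n_years * days_per_year * minutes_per_day
    return n_bars * n_fields * bytes_per_value


total = estimate_bytes(n_securities=5000, n_years=10)
print("approx. %.1f GB" % (total / 1e9))
```

Under those assumptions it comes out to roughly 197 GB, which is uncomfortably close to the 244 GB instance size mentioned above, before counting any working memory for the analysis itself.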
@Lucas,
"There are so many software engineers here that if Quantopian made it easier to contribute I am sure many would help."
Quantopian tends to be cagey about matters beyond what has been open-sourced in zipline. This is understandable, since if they opened up everything, their IP would just get copied. It would be truly revolutionary if they opened up their IP and day-to-day development process to the crowd. My hunch is that they'll continue with their current model of open-sourced zipline and closed-source everything else.
There's also the business side of things to consider. They would have to open up about the constraints on various approaches due to cost, which vendors are being considered, etc. And they probably have NDAs, terms, etc. with vendors, and the lawyers wouldn't know what to do with it all ("Huh? You want to 'open-source' the business? No more lawyers? Request denied!"). Again, it would be truly revolutionary to open it all up to the crowd, but my read is that there is a part of the Quantopian operation that won't be open-sourced.
The idea of an integration server / test instance is interesting. I suppose it would require one for the research platform and one for the backtester. Just a matter of dollars and resources. They'll eventually get there, I figure. At the scale of $10B capital, with institutional investors, they'll need a whole QC apparatus. No more testing changes on production systems. It would be nice, though, to be able to pull custom code into the research platform. Then zipline variants could be tried there on actual Quantopian data sets (presumably, Quantopian engineers can do something along these lines already).