Historical Data

Hi everybody!

Some questions from a newbie:

1) On what kind of historical data can I backtest my algo? Do you have historical tick-by-tick resolution?

2) Can I get access to this database in some way? For example, I'm currently developing a trading system in Java; is there some way to use your data with my trading system?

3) Isn't backtesting always biased in some way? I mean, how can you know for sure that, had you been present in the book of a certain security, the other market participants would have behaved as they did? For the AAPL book, given its huge volumes, the market would probably have behaved the same with 100,000 USD on the bid or the ask, but it's still not 100% certain.

Thanks

4 responses

Welcome Emiliano! To answer your questions,

  1. You can backtest your strategy on historical trade data since 2002 for US stocks and ETFs. We don't have tick-by-tick data; the finest resolution is one minute. We're not an HFT platform, so you can run a backtest in either minute or daily mode. For more information about our data sources, take a look at the FAQ and help doc.

  2. Per our contract with our vendor, we are not allowed to redistribute the data. You are more than welcome to develop your strategy in the IDE and tweak your code to see different returns, positions, and balances. Unfortunately, you cannot download the database, logs, or backtest results.

  3. Backtesting is easily subject to bias, and we've taken great care to prevent it. First, data is introduced to your algorithm only after it becomes available, to prevent look-ahead bias. For example, when you ask for the current price, we pass the price from the previous bar, since you can't know the market price at that instant. If you're using Fetcher, data is fed into the algorithm when the backtest reaches the assigned date, instead of being exposed at the start of the backtest. Second, our historical database includes all stocks that have traded since 2002, including ones that are no longer trading (such as LEH), to prevent survivorship bias. Third, if you're worried about cherry-picking stocks, you can use the set_universe function to create a basket of stocks based on their dollar volume. This creates a universe of up to 2% of stocks in minute mode (~160 stocks) or up to 10% in daily mode (~800 stocks), which helps prevent hindsight bias.
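To make the "previous bar" idea concrete, here is a minimal sketch in plain Python. It does not use the Quantopian API; `run_lagged_backtest` and `on_bar` are hypothetical names chosen for illustration. It only shows the general pattern of handing the algorithm data that was already available at the time of the call.

```python
from typing import Callable, List, Optional


def run_lagged_backtest(
    prices: List[float],
    on_bar: Callable[[int, Optional[float]], None],
) -> None:
    """Call on_bar once per bar, exposing only the previous bar's price.

    On the first bar there is no earlier price, so None is passed.
    """
    for i in range(len(prices)):
        # The algorithm never sees prices[i] while bar i is in progress.
        visible_price = prices[i - 1] if i > 0 else None
        on_bar(i, visible_price)


# Record what a trivial "algorithm" would have been shown on each bar.
seen: List[Optional[float]] = []
run_lagged_backtest([10.0, 11.0, 12.0], lambda i, price: seen.append(price))
print(seen)
```

The price visible on each bar lags the true series by one step, so a decision made on bar `i` cannot depend on information from bar `i` itself.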

Backtesting is a way to quickly see how a strategy would have performed in the past. It's an easy way to discard poorly performing ideas and focus on developing more promising algorithms!

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Thank you Alisa for your reply!

I am just a bit unhappy with your answer to point 3. I think you didn't get exactly what I wanted to say. Basically, my point is that backtesting is always "biased", in the sense that you can't know (or at least I don't know how to know) whether the other market participants would have reacted the same way had you been present in the market. In backtesting you say: I bought or sold these shares at this price. But you can't know whether, for example, your being in the book of that particular stock would have led the other bids or asks to change their prices or sizes.
It's highly probable that you would have bought or sold at that price, but with a large amount of capital, let's say 10 million dollars, you can't know for sure.
I hope you understand what I wanted to say.

Ah, now I understand the question. Let me try that answer again :)

You can use slippage to model the impact of your trades on the market. By default, you can transact up to 25% of the stock's volume in that bar. You can easily tweak this parameter to better fit your strategy, or turn it off entirely. If you don't like either of our two built-in slippage functions, you can write your own custom-built function to better simulate the market impact of your orders. Here's an example.
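As a rough illustration of the volume cap described above (not the example originally linked, and not the platform's actual implementation), here is a minimal plain-Python sketch. The function name `simulate_fill`, the `price_impact` parameter, and the quadratic impact formula are assumptions made for illustration; only the 25% default volume limit comes from the answer above.

```python
from typing import Tuple


def simulate_fill(
    order_shares: int,
    bar_volume: int,
    bar_price: float,
    volume_limit: float = 0.25,   # max fraction of the bar's volume we may trade
    price_impact: float = 0.1,    # assumed impact coefficient (illustrative)
) -> Tuple[int, float]:
    """Return (filled_shares, fill_price) for one bar.

    Positive order_shares is a buy, negative is a sell.
    """
    if bar_volume <= 0 or order_shares == 0:
        return 0, bar_price

    direction = 1 if order_shares > 0 else -1
    # Cap the fill at volume_limit of the volume traded in this bar.
    max_shares = int(bar_volume * volume_limit)
    filled = min(abs(order_shares), max_shares)

    # Assumed model: impact grows with the square of our share of volume.
    volume_share = filled / bar_volume
    impact = price_impact * volume_share ** 2
    # Buys fill above the bar price, sells below it.
    fill_price = bar_price * (1 + direction * impact)
    return direction * filled, fill_price


# A 1,000-share buy in a bar that traded 2,000 shares is only partly filled.
filled, price = simulate_fill(1000, 2000, 100.0)
print(filled, price)
```

Under these assumptions, the larger your order is relative to the bar's volume, the fewer shares fill in that bar and the worse the price you get, which is the kind of market impact the question was about.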

To further model real-world conditions, you can set a commission so that transaction costs are incorporated into your algo's returns. If you don't specify a commission, your backtest defaults to $0.03 per share. You can always update this parameter to reflect your broker's rates.
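To see how a per-share commission eats into returns, here is a small plain-Python sketch. The helper `net_pnl` is hypothetical, and charging the commission on both the entry and the exit is an assumption; the $0.03 default is the figure mentioned above.

```python
def net_pnl(
    shares: int,
    entry_price: float,
    exit_price: float,
    cost_per_share: float = 0.03,
) -> float:
    """Profit on a round trip after per-share commission.

    Positive shares is a long position, negative is a short position.
    """
    gross = shares * (exit_price - entry_price)
    # Assumption: commission is paid once on entry and once on exit.
    commission = 2 * abs(shares) * cost_per_share
    return gross - commission


# Buying 100 shares at $10.00 and selling at $10.50:
# gross profit is $50, commission is $6, so roughly $44 remains.
print(net_pnl(100, 10.0, 10.5))
```

For strategies that trade often on small price moves, this cost can be the difference between a profitable and an unprofitable backtest, so it is worth setting it to match your broker.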

Hope that better answers the question!

This is exactly what I wanted to hear and to know!
Perfect answer! Thanks