Seemingly Corrupted Stock Data

I am fairly new to this, so perhaps I am missing something obvious, but I have developed an algorithm that works well enough for now. There isn't much to it, so I don't think anything is happening that I don't understand. But when I backtest, I encounter stocks that make the algo do absolutely crazy things, like buying insane amounts of the stock (making my cash go negative) and pushing my returns to positive and negative thousands (sometimes tens of thousands) of percent. My algo works for nearly all of the stocks that get put into it, buying and selling appropriately, but when it encounters these stocks, the entire thing goes haywire. The only thing I can think of is that these troublesome stocks' data is somehow corrupted, so when they come up, they cause all this unexpected behavior. Is this a problem that anyone else has encountered? And, if so, how did you work around it?

I have a large filter in pipeline to get rid of many weird things (as demonstrated in the tutorial), and I use can_trade. Once I go through and exclude the troublesome stocks by hand (Equity != sid(40009)), things run smoothly, but there just seem to be so many of these.

Any help would be greatly appreciated!

6 responses

Brady -

There are sometimes data errors. I recommend reporting them, as I did on https://www.quantopian.com/posts/missing-split-ppg . Unfortunately, Quantopian does not maintain a user-facing list of them.

However, it could be that your algo is responding whackily to certain stocks, which is a different story. If your code says, effectively, buy XYZ at 1000X leverage, if certain conditions are met, then you will see crazy behavior, but it would not necessarily mean that the XYZ data are corrupt.

Thanks, Grant.

I don't think I have any kind of manipulation like that going on. I have the outline of my buy function below. Really all it does is act as a way to further filter my pipeline results, and, if a stock passes through, it just gets bought at the end. Seems simple enough, and it works correctly for 95% of the stocks that it encounters.

if cash > 0:
    for Equity in context.output.index:
        if cash > 0:  # re-check, since we will be buying below then circling back up
            if data.can_trade(Equity) and (cash > data.current(Equity, 'price')):
                if Equity not in context.portfolio.positions:
                    amount = cash / data.current(Equity, 'price')

                    # only buy if the price is more than 2% below the latest close
                    if data.current(Equity, 'price') < (context.output.latest_close_price[Equity] * .98):
                        stopPrice = data.current(Equity, 'price') * .98
                        order_target_percent(Equity, 0.25, style=StopOrder(stopPrice))

                        cash -= (amount * data.current(Equity, 'price'))

This all seems simple enough, I'm not quite certain where it is going so drastically wrong.

You might try adding:

record(leverage = context.account.leverage)

Your leverage should stay within the bounds of your strategy.
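To make the leverage check concrete, here is a minimal sketch of how gross leverage is commonly defined: the sum of absolute position values divided by net liquidation value. The function name `gross_leverage` and the dollar figures are made up for illustration, and this is not claimed to be Quantopian's exact calculation.

```python
def gross_leverage(position_values, cash):
    """Gross exposure divided by net liquidation value.

    position_values: signed dollar value of each position (hypothetical input).
    cash: cash balance, which may be negative if the algo overspent.
    """
    gross_exposure = sum(abs(v) for v in position_values)
    net_liquidation = cash + sum(position_values)
    if net_liquidation <= 0:
        # The account is wiped out or worse; leverage is not meaningful.
        return float("inf")
    return gross_exposure / net_liquidation


# A 25% position in a $1000 account: leverage stays modest.
normal = gross_leverage([250.0], cash=750.0)

# The same $1000 account after overbuying into negative cash.
overbought = gross_leverage([6000.0], cash=-5000.0)
```

If the recorded leverage jumps far above what the strategy intends (here from 0.25 to 6.0), the algo is placing more orders than expected, regardless of whether the price data are clean.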

Also, what is your list of sids that are giving you trouble, and around what dates? Usually the problem is missing splits, which is fairly straightforward to ferret out (using https://www.splithistory.com/ ).
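One rough way to look for a possible missing split is to scan a price series for large day-over-day jumps and then check those dates against a split history source. The sketch below works on a plain list of closing prices; the function name `suspicious_jumps` and the 40% threshold are assumptions chosen for illustration, not anything the platform provides.

```python
def suspicious_jumps(closes, threshold=0.4):
    """Return indices where the close moved by more than `threshold`
    (as a fraction) relative to the previous close."""
    flagged = []
    for i in range(1, len(closes)):
        prev, cur = closes[i - 1], closes[i]
        if prev <= 0:
            continue  # skip bad or missing prices rather than divide by zero
        if abs(cur / prev - 1.0) > threshold:
            flagged.append(i)
    return flagged


# Hypothetical closes with what looks like an unadjusted 2-for-1 split on day 2.
closes = [100.0, 101.0, 50.0, 51.0]
flagged_days = suspicious_jumps(closes)
```

A flagged day is only a candidate: a real crash or a buyout can produce the same jump, so each hit still needs to be confirmed by hand.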

I will give the leverage thing a shot.

I have had problems with so many stocks. Below is a list of the ones I have picked out that have caused problems (always the same problem), though I am certain that is not all of them.

A specific one with a date is GRAN: sid(3329), on 2011-05-03.

If I start with $1000, it buys the appropriate amount ($978.63), but then on the same day it sells $6154.50 even though the stock hasn't gone up nearly that much; it just oversells. Then it spends the next two days only buying this stock (which I assume is just trying to make up for the oversell on the first day). What I just can't figure out is what it is about these certain stocks. It works so well until it runs into one of them, and then it does the craziest things. For a while, I thought that it might have something to do with volume, but after some testing and experimentation, that doesn't seem to be it. And like I said, when these are all picked out over a certain time span, things move smoothly and all the other stocks behave as intended.

& ((Equity != sid(26384)) & (Equity != sid(12008)) & (Equity != sid(42844)) & (Equity != sid(13577)) & (Equity != sid(26673)) & (Equity != sid(33078)) & (Equity != sid(15988)) & (Equity != sid(20187)) & (Equity != sid(22315)) & (Equity != sid(39173)) & (Equity != sid(6362)) & (Equity != sid(14291)) & (Equity != sid(29115)) & (Equity != sid(8413)) & (Equity != sid(21145)) & (Equity != sid(12190)) & (Equity != sid(16352)) & (Equity != sid(33463)) & (Equity != sid(36966)) & (Equity != sid(39180)) & (Equity != sid(20776)) & (Equity != sid(24657)) & (Equity != sid(3945)) & (Equity != sid(30851)) & (Equity != sid(45080)) & (Equity != sid(28559)) & (Equity != sid(40009)) & (Equity != sid(39522)) & (Equity != sid(16192)) & (Equity != sid(35088)) & (Equity != sid(30739)) & (Equity != sid(16585)) & (Equity != sid(44416)) & (Equity != sid(13867)) & (Equity!= sid(11059)) & (Equity != sid(27911)) & (Equity != sid(28788)) & (Equity != sid(40117)) & (Equity != sid(30504)) & (Equity != sid(25776)) & (Equity != sid(2391)) & (Equity != sid(21842)) & (Equity != sid(9621)) & (Equity != sid(32624)) & (Equity != sid(36189))
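A long chain of `!=` comparisons like the one above can be expressed more compactly as a set membership test. The sketch below uses plain integer ids for the first few sids from the list; the names `EXCLUDED_SIDS` and `is_allowed` are hypothetical, and it assumes each equity object exposes its integer id (for example through a `.sid` attribute), which you would pass in.

```python
# First few ids from the exclusion list above; extend with the rest as needed.
EXCLUDED_SIDS = frozenset({26384, 12008, 42844, 40009})


def is_allowed(sid_id, excluded=EXCLUDED_SIDS):
    """Return True if the integer sid is not on the exclusion list."""
    return sid_id not in excluded


# Filter a hypothetical list of candidate ids down to the tradable ones.
candidates = [26384, 1, 40009, 2]
tradable = [s for s in candidates if is_allowed(s)]
```

Keeping the ids in one collection also makes it easier to add or remove entries without touching the condition itself.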

I have figured out the problem. I failed to account for open orders, so when I was placing things like stop or limit orders, they tended to get doubled up, and this was compounded further if orders were still open when I hit a rebalance. When these perfect storms hit, I would be all over the place, and it would take a couple of days, sometimes a week, of buying and selling back and forth before things stabilized.

Once I added

if get_open_orders(Security) == []:

in front of those places, things smoothed out.

You should be able to use:

if not get_open_orders(security):  

This might be the better way to code it, per some examples I see on the help page.

Also, note that all open orders are automatically canceled at the end of each trading day. There is no way (that I know of) to override this behavior (other than re-entering the orders the next day).
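The doubling-up described above can be reproduced with a toy model. The sketch below assumes, as in the problem described in this thread, that a target-percent order is sized from filled shares only and ignores orders that are still pending. `ToyBroker` and `run` are hypothetical stand-ins written for this illustration, not Quantopian's API.

```python
class ToyBroker:
    """Hypothetical single-stock backtester used only for illustration."""

    def __init__(self, portfolio_value, price):
        self.portfolio_value = portfolio_value
        self.price = price
        self.position = 0       # filled shares
        self.open_orders = []   # pending share amounts, not yet filled

    def get_open_orders(self):
        return list(self.open_orders)

    def order_target_percent(self, target):
        # Sized from filled shares only; pending orders are ignored.
        target_shares = int(self.portfolio_value * target / self.price)
        delta = target_shares - self.position
        if delta:
            self.open_orders.append(delta)

    def fill_all(self):
        self.position += sum(self.open_orders)
        self.open_orders = []


def run(guarded, calls=3):
    """Place the same 25% target order `calls` times before anything fills."""
    broker = ToyBroker(portfolio_value=1000.0, price=10.0)
    for _ in range(calls):
        if guarded and broker.get_open_orders():
            continue  # an order is already pending, so do not stack another
        broker.order_target_percent(0.25)
    broker.fill_all()
    return broker.position


unguarded_shares = run(guarded=False)
guarded_shares = run(guarded=True)
```

Without the guard the toy account ends up with 75 shares instead of the intended 25; with the guard it ends up with 25. An empty list is falsy in Python, so `if not get_open_orders(security):` and `== []` behave the same way when the function returns a list.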