Errors in fundamental data

In using get_fundamentals to pull quarterly data for 10/1/2014 and 12/31/2014, I came across what appear to be data errors in Accounts_Receivable (A/R) and Total_Revenue.

Equity(44432 [KNOP]) A/R of 0 in 10/1/2014
Equity(45557 [RMAX]) A/R 1000x smaller at 10/1/2014 than at 12/31/2014
Equity(7457 [TJX]) Revenue 1000x higher in 10/1/2014 than in 12/31/2014
Equity(7904 [VAR]) A/R 1000x smaller at 10/1/2014 than at 12/31/2014
Equity(16323 [NSP]) A/R 100x smaller at 10/1/2014 than at 12/31/2014

Is there a place (other than the discussion forums) where I should send suspected data errors like these? Or should I just exclude companies with problem data from my universe? How are other Quantopians dealing with fundamental data problems?
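One way to surface jumps like the 100x and 1000x cases listed above is to check whether the ratio between two reported values sits close to a power of ten, which usually points to a units mix-up rather than real growth. This is only a minimal sketch in plain Python: `scale_jump` and its `tolerance` parameter are hypothetical names, the sample numbers are made up for illustration, and nothing here calls the Quantopian API.

```python
import math

def scale_jump(prev, curr, tolerance=0.15):
    """Return the power of ten (3 for ~1000x, -3 for ~1/1000x) when
    curr/prev looks like a unit-scale shift of 100x or more.
    Returns None otherwise, including for missing, zero, or negative
    inputs, where a ratio is undefined."""
    if prev is None or curr is None or prev <= 0 or curr <= 0:
        return None
    log_ratio = math.log10(curr / prev)
    nearest = round(log_ratio)
    # Only flag shifts of at least two orders of magnitude that land
    # close to an exact power of ten.
    if abs(nearest) >= 2 and abs(log_ratio - nearest) <= tolerance:
        return nearest
    return None

# Illustrative values only, not actual reported figures.
print(scale_jump(1.2e6, 1.25e9))   # 3   (about 1000x larger)
print(scale_jump(6.9e12, 7.4e9))   # -3  (about 1000x smaller)
print(scale_jump(100.0, 130.0))    # None (ordinary change)
```

A flagged pair still needs a human look, since the check cannot tell which of the two values is the wrong one.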

5 responses

Tom,

I dug into some of these (though not all).

For KNOP, the A/R of 0 looks to be legitimate. There is nothing amiss in the raw data as best I can tell, and we are portraying the reports from Morningstar properly. Google Finance, as a secondary source, appears to confirm this: https://www.google.com/finance?q=NYSE%3AKNOP&fstype=ii&ei=3hgcVomJCsvIec3anJAM (click on the Balance Sheet tab).

Similarly, for RMAX, this simply looks to be what the data from Morningstar is telling us.

I do not know enough about A/R accounting practices to tell you whether these behaviors seem reasonable or not. If you think these data points are truly in error, let me know and I can notify Morningstar to make a correction in their data.

For TJX, the revenue number does indeed come in at 6.9e+12. Morningstar kept it at that erroneous level for two days and then corrected it. So two problems here: 1) Morningstar generated bad data. It happens, and you should expect it to happen on occasion and likely defend against it in your algos. 2) We don't seem to be processing Morningstar's subsequent correction. I've submitted a ticket for our engineers to dig into why this particular update was not applied.

Thanks so much,
Josh

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Hi Josh - hopefully you won't be back-correcting it, right?

There are two problems with the TJX revenue number.

Problem 1: Morningstar gave us a bad number, and then corrected it two days later.
Problem 2: Quantopian for some reason didn't process the correction.

We will not fix Problem 1. That's a fact of life. Bad numbers come out from Morningstar. We should show the fact that Morningstar published a bad number and that it existed as Morningstar's best number for 2 days. Consequently, your algos should defend against unlikely scenarios like revenue in the trillions of dollars.
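As a rough illustration of that kind of defense, an algo could drop any security whose reported revenue falls outside a plausible range before ranking or screening. The sketch below is plain Python and does not use the Quantopian API; `filter_implausible` and the `max_revenue` ceiling of 1e12 are assumptions chosen for the example, and the input values are made up.

```python
def filter_implausible(revenue_by_symbol, max_revenue=1e12):
    """Split a {symbol: revenue} mapping into (kept, rejected).
    A value is rejected when it is missing, negative, or at or above
    max_revenue (an assumed ceiling, here one trillion dollars)."""
    kept, rejected = {}, {}
    for symbol, revenue in revenue_by_symbol.items():
        if revenue is None or revenue < 0 or revenue >= max_revenue:
            rejected[symbol] = revenue
        else:
            kept[symbol] = revenue
    return kept, rejected

# Illustrative values only; 6.9e12 mirrors the bad TJX figure above.
sample = {"TJX": 6.9e12, "AAA": 7.4e9, "BBB": None}
kept, rejected = filter_implausible(sample)
print(sorted(kept))      # ['AAA']
print(sorted(rejected))  # ['BBB', 'TJX']
```

A fixed ceiling only catches the most extreme errors, so it works best alongside a relative check against the security's own history.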

We will likely fix Problem 2. If this processing problem stops happening going forward, the data you backtest against shouldn't exhibit it either. I can see the opposing argument, so I'm open to disagreement, but that's where I stand right now.

Hope that helps,
Josh

Thanks, yeah that's fine, as long as it stays bad in perpetuity when asking for it on one of those two days!

So how do we protect ourselves from Morningstar bad data? If I see a number change but I suspect it should only change on announcements (such as revenue, but not ratios), should I flag it as suspect? Any other ideas?
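One possible answer, sketched here under assumptions rather than offered as an established practice: only act on a new value once it has been reported unchanged for a few consecutive days, so a figure that appears and is corrected two days later never reaches the algo. The function name `confirmed_series` and the `min_days` parameter are hypothetical, the daily values are made up, and the code is plain Python with no Quantopian API calls.

```python
def confirmed_series(daily_values, min_days=3):
    """For each day, return the most recent value that has been
    reported unchanged for at least min_days consecutive days.
    Days before any value is confirmed yield None."""
    confirmed = None   # last value that passed the persistence test
    candidate = None   # value currently being observed
    run = 0            # consecutive days the candidate has been seen
    out = []
    for value in daily_values:
        if value == candidate:
            run += 1
        else:
            candidate = value
            run = 1
        if run >= min_days:
            confirmed = candidate
        out.append(confirmed)
    return out

# Illustrative daily revenue readings: a bad value lasts two days.
daily = [7.0e9, 7.0e9, 7.0e9, 6.9e12, 6.9e12, 7.4e9, 7.4e9, 7.4e9]
print(confirmed_series(daily))
```

The cost is latency: every legitimate update is also delayed by `min_days - 1` days, so the threshold is a trade-off between reacting quickly and filtering short-lived errors.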