Access to fundamental data from previous quarters/years

Hello,

I'm implementing a backtest and I'm trying to access financial data from previous quarters/years so that I can measure YoY growth. For example: compare the most recent quarter's EPS to the EPS of the same quarter one year earlier.

I'm able to get the most recent quarter EPS data by using: morningstar.earnings_report.diluted_eps.latest

I found this post and it looks like it isn't supported yet?
https://www.quantopian.com/posts/period-ending-date-and-historical-fundamental-data-quarterly-slash-yearly

I appreciate any help! I've been stuck on this for a few days...

Thanks,
--Brian

19 responses

First off, you will want to use the pipeline to fetch and access fundamental data. Check out the tutorials if you are not familiar with it. You may also want to look at these posts:
https://www.quantopian.com/posts/fundamental-historical-data-1
https://www.quantopian.com/posts/historical-fundamental-data-1

To get the most current fundamental data, simply use the '.latest' attribute of any field (i.e. bound column), as shown below.
To get fundamental data as of x trading days ago (e.g. 252 trading days, or roughly 1 calendar year), you will need a small custom factor, also shown below.

from quantopian.pipeline import CustomFactor  
from quantopian.pipeline.data import morningstar

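class Previous(CustomFactor):
    # Returns the value of the input x trading days ago, where x is the window_length.
    # Both inputs and window_length must be passed at construction; there are no defaults.

    def compute(self, today, assets, out, inputs):
        # With a single input, `inputs` is a (window_length x assets) array;
        # row 0 is the oldest row in the window.
        out[:] = inputs[0]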

eps_today = morningstar.earnings_report.diluted_eps.latest  
eps_1_year_ago = Previous(inputs = [morningstar.earnings_report.diluted_eps], window_length = 252)  
eps_last_qtr = Previous(inputs = [morningstar.earnings_report.diluted_eps], window_length = 63)

Hope that helps.
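
Once the current and year-ago values are available, the YoY comparison itself is a small calculation. The following is a minimal sketch in plain Python rather than pipeline code; the `yoy_growth` helper and the sample EPS values are hypothetical. It divides by the absolute value of the year-ago figure so that a shrinking loss shows up as positive growth, and returns NaN when the base is zero or missing.

```python
import math

def yoy_growth(current, year_ago):
    """Fractional growth from year_ago to current, or NaN when undefined."""
    if current is None or year_ago is None:
        return math.nan
    if math.isnan(current) or math.isnan(year_ago) or year_ago == 0:
        return math.nan
    # Divide by the absolute base so a move from -1.00 to -0.50 reads as +50%.
    return (current - year_ago) / abs(year_ago)

# Hypothetical EPS values keyed by ticker, for illustration only.
eps_now = {"AAA": 1.20, "BBB": -0.50, "CCC": 0.80}
eps_year_ago = {"AAA": 1.00, "BBB": -1.00, "CCC": 0.00}

growth = {sym: yoy_growth(eps_now[sym], eps_year_ago[sym]) for sym in eps_now}
print(growth)
```

Whether a negative or zero base should yield NaN, a failed screen, or something else is a design choice that depends on the strategy.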

Thanks so much Dan! That helps enormously.

Can you assume that each quarter is 63 trading days? So if I want to walk back quarters, -63 days, -63 days [...]?

Again thank you so much for this!

Brian,

No, the quarters aren't all exactly 63 trading days, because holidays aren't distributed evenly. Maybe look at https://en.wikipedia.org/wiki/Trading_day. That said, a bigger problem with comparing fundamental data is that it's reported by a company up to 45 days after a quarter ends. So, for instance, if one were to look at data on Jan 10, 2017, it may not be for the quarter ending Dec 31 (because that data hadn't been updated yet). Moreover, some companies will have filed already and some won't, so some companies will have current data and others not. You may be comparing apples to oranges.
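
To see how quarter lengths vary, here is a rough sketch that counts trading days per calendar quarter in 2017. It uses only the standard library; the holiday set is a hardcoded list of NYSE full-day closures for 2017 and is an assumption that should be checked against the exchange's published calendar. The `trading_days` helper is hypothetical.

```python
from datetime import date, timedelta

# NYSE full-day closures in 2017, hardcoded for illustration (assumption:
# verify against the exchange calendar before relying on this list).
NYSE_HOLIDAYS_2017 = {
    date(2017, 1, 2), date(2017, 1, 16), date(2017, 2, 20),
    date(2017, 4, 14), date(2017, 5, 29), date(2017, 7, 4),
    date(2017, 9, 4), date(2017, 11, 23), date(2017, 12, 25),
}

def trading_days(start, end, holidays):
    """Count weekdays from start to end (inclusive) that are not holidays."""
    count = 0
    day = start
    while day <= end:
        if day.weekday() < 5 and day not in holidays:
            count += 1
        day += timedelta(days=1)
    return count

QUARTERS_2017 = [
    (date(2017, 1, 1), date(2017, 3, 31)),
    (date(2017, 4, 1), date(2017, 6, 30)),
    (date(2017, 7, 1), date(2017, 9, 30)),
    (date(2017, 10, 1), date(2017, 12, 31)),
]

per_quarter = [trading_days(s, e, NYSE_HOLIDAYS_2017) for s, e in QUARTERS_2017]
print(per_quarter)       # [62, 63, 63, 63]
print(sum(per_quarter))  # 251
```

Under these assumptions, stepping back in fixed 63-day increments drifts by a day or so per year relative to the actual quarter boundaries.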

One way to mitigate this is to time your algorithm to not look at fundamental data before 45 days after the end of each quarter (so a blackout period). In the long run it may not impact your algorithm if you don't do this but it's something to think about. A more 'precise' method would be to check the filing dates in the fields below.
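
The blackout idea could be sketched roughly as follows. This assumes calendar quarters (many companies have fiscal quarters that end on other dates), and the helper names `previous_quarter_end` and `in_blackout` are hypothetical.

```python
from datetime import date, timedelta

def previous_quarter_end(day):
    """Last day of the calendar quarter that ended before `day`'s quarter began."""
    first_month = 3 * ((day.month - 1) // 3) + 1   # 1, 4, 7 or 10
    quarter_start = date(day.year, first_month, 1)
    return quarter_start - timedelta(days=1)

def in_blackout(day, blackout_days=45):
    """True while fewer than or exactly `blackout_days` have passed since quarter end."""
    return (day - previous_quarter_end(day)).days <= blackout_days

print(in_blackout(date(2017, 1, 10)))  # True: 10 days after Dec 31, 2016
print(in_blackout(date(2017, 2, 20)))  # False: 51 days after Dec 31, 2016
```

An algorithm could skip rebalancing on fundamental data while `in_blackout` returns True and act only once most companies are expected to have filed.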

There are two interesting fields in the morningstar dataset: financial_statement_filing.file_date and financial_statement_filing.period_ending_date. Add those as columns to some fundamental pipeline output in a notebook, then run the notebook for a single stock over a year's time. You can get a feel for when and how the data is updated.

One curiosity is that sometimes the data changes a few days after the 'file_date'. This implies that Morningstar took a day or two to update their data and get it posted. The implication, however, is that the 'street' knew this data on the filing date (or the following trading day), but we had to wait a day or two before our algorithm could act on it. Another curiosity is that sometimes the 'file_date' is BEFORE the 'period_ending_date'. Maybe someone can explain that to me?

Anyway, hope that helps.

Here is a notebook showing an example of filing dates and period ending dates. For anyone using fundamental data it's helpful to see what the data looks like and when it's updated.

Dan,

I appreciate all the thought into this. It seems like comparing EPS YoY has a lot of nuances. Some background to what I'm doing.

I've been following AAII stock screens for a number of years and I wanted to backtest their CANSLIM screen:
http://www.aaii.com/stock-screens/screendata/CANSLIMRev?ct=e87d70da4948fb2a58e0d2d77c132d1644f2a4cbe0bdbf035e243a7488971ae73784856fd67ea0a8d0fa85c4ba2462251f8140af947e1945ac1a6c31586ee9ab

I believe I read on one of the AAII forums that their data provider was Reuters. I wonder if they provide the data in an easier-to-consume format.

Here is the description of the stock screen:

The growth rate in earnings per share from continuing operations between the last reported fiscal quarter (Q1) and the same quarter one year prior (Q5) is greater than or equal to 20%
The growth rate in earnings per share from continuing operations between the last reported fiscal quarter (Q1) and the same quarter one year prior (Q5) is greater than the growth rate in earnings per share from continuing operations between the reported fiscal period two quarters ago (Q2) and the same quarter one year prior (Q6)
The growth rate in sales between the last reported fiscal quarter (Q1) and the same quarter one year prior (Q5) is greater than 25%
Earnings per share from continuing operations for the last reported fiscal quarter (Q1) is greater than zero (is positive)
Earnings per share from continuing operations for the last trailing 12 months (last four fiscal quarters) (12m) is greater than earnings per share from continuing operations for the last reported fiscal year (Y1)
Earnings per share from continuing operations for the last fiscal year (Y1) is greater than earnings per share from continuing operations from two fiscal years ago (Y2)
Earnings per share from continuing operations from two fiscal years ago (Y2) is greater than earnings per share from continuing operations from three fiscal years ago (Y3)
Earnings per share from continuing operations from three fiscal years ago (Y3) is greater than earnings per share from continuing operations from four fiscal years ago (Y4)
The consensus earnings estimate for the current fiscal year (Q0) is greater than fully diluted earnings per share from continuing operations for the last reported fiscal year (Y1)
The compounded, annualized growth rate in earnings per share from continuing operations over the last three years is greater than or equal to 25%
The current stock price as a percentage of its 52-week high is greater than or equal to 90%
The percentage rank for relative strength over the last 52 weeks is greater than 80
There are at least 10 institutional shareholders
The number of shares purchased by institutions over the last quarter is greater than or equal to the number of shares sold by institutions over the last quarter
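
As a rough illustration, the EPS-based rules in this list could be expressed as below. This is only a sketch in plain Python with hypothetical inputs: `q` holds quarterly EPS with the most recent quarter first (Q1 through Q6), and `y` holds fiscal-year EPS (Y1 through Y4). It assumes the three-year compounded growth rate is measured from Y4 to Y1 and treats any growth rate with a non-positive base as a failure; the screen's actual conventions may differ.

```python
def growth(new, old):
    """Fractional growth; None when the base is not positive."""
    return (new - old) / old if old > 0 else None

def passes_eps_rules(q, y):
    """q: quarterly EPS [Q1..Q6], y: annual EPS [Y1..Y4], most recent first."""
    g15 = growth(q[0], q[4])   # Q1 vs the same quarter a year earlier (Q5)
    g26 = growth(q[1], q[5])   # Q2 vs the same quarter a year earlier (Q6)
    if g15 is None or g26 is None or y[0] <= 0 or y[3] <= 0:
        return False
    cagr = (y[0] / y[3]) ** (1.0 / 3.0) - 1.0   # assumed span: Y4 to Y1
    return (
        g15 >= 0.20                      # quarterly EPS growth of at least 20%
        and g15 > g26                    # growth is accelerating
        and q[0] > 0                     # latest quarter is profitable
        and sum(q[:4]) > y[0]            # trailing 12 months beats last fiscal year
        and y[0] > y[1] > y[2] > y[3]    # four years of rising annual EPS
        and cagr >= 0.25                 # three-year compounded growth
    )

# Hypothetical data for illustration only.
annual = [4.30, 3.20, 2.50, 2.00]
print(passes_eps_rules([1.50, 1.30, 1.20, 1.10, 1.00, 1.00], annual))  # True
print(passes_eps_rules([1.10, 1.30, 1.20, 1.10, 1.00, 1.00], annual))  # False
```

The sales, price, relative strength, estimate, and institutional ownership rules would need additional data and are left out here.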

Dan, I wanted to thank you for the example above. The pipeline tutorials are very good, but this was a question I also had. The code example is valuable for expressing fundamental trend ideas, as are the issues you bring up about timing and availability of data, which I'll have to deal with.

I'm just learning Quantopian now. I'm a long-time user of AAII Stock Investor Pro and have always wanted to backtest the ideas I've been building screens and spreadsheets for over the years, based on my interpretation of strategies. I've also done some work in AmiBroker, which didn't have the fundamental data but allowed algorithmic trading that I'd like to integrate and test here with clean data to avoid survivorship bias.

That example was very helpful. I created a few algorithms using the 63-day quarter period if anyone wants to see what it looks like. One of them goes back 9 quarters to make a moving average calculation. Haven't tried out the filing date and quarter ending date fields yet, so this method might not be completely reliable. Should be enough information to set up that AAII screen, though.

Thought it was all quarterly data, not completely sure on that though. I did make a few mistakes in this version of the algorithm: the window lengths are supposed to change by 63 each time, and the last quarter variable uses free cash flow yield when it's supposed to be gross margin.

I fixed the algorithm. This version doesn't do as well, but I made another one with two fundamental metrics that performs better.

I've been doing some auditing of morningstar fundamental data and have come to the conclusion that it needs some serious data scrubbing before I'd be comfortable running any algorithm off of it.

For many financial statement line items, I can find .latest values that are stale by a few quarters. Allocating a portfolio position to a company that hasn't reported in a year due to financial difficulties could explain some of your poor performance. To ameliorate that risk, you'll have to do things like validate whether the as_of dates are within a certain proximity to the last filing date. I've started writing a bunch of cleaning functions to fix issues like this, but I keep finding new things that need to be fixed. It's a pretty big job, not to mention that I frequently run into memory problems trying to simply extract data for more than a couple of years. Data quality issues and constant freezing in the IDE and research environment make working with Morningstar financial data a harrowing experience. As it stands now, I don't think I can reproduce the results of many academic accounting studies (which use Compustat data) without undertaking a huge data scrubbing project first. I wish someone from Quantopian would provide some sort of update on the firm's fundamental data quality initiatives.
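
A staleness check of the kind described here might look roughly like this. It is a sketch in plain Python, not one of the cleaning functions mentioned in the post; the `is_stale` helper, the 120-day threshold, and the sample dates are all hypothetical.

```python
from datetime import date

def is_stale(as_of, today, max_age_days=120):
    """True when the latest report is missing or older than max_age_days."""
    return as_of is None or (today - as_of).days > max_age_days

# Hypothetical latest as_of date per ticker, for illustration only.
latest_as_of = {
    "AAA": date(2016, 12, 31),   # 74 days old on the test date
    "BBB": date(2015, 12, 31),   # more than a year old
    "CCC": None,                 # never reported
}
today = date(2017, 3, 15)

fresh = sorted(sym for sym, d in latest_as_of.items() if not is_stale(d, today))
print(fresh)  # ['AAA']
```

The threshold is a judgment call: too tight and late filers drop out during reporting season, too loose and companies that have stopped reporting stay in the universe.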

If the results aren't matching academic studies, it might also be caused by Quantopian calculating slippage and commissions differently.

Hey Eric, Bing,

I have been trying to write similar algorithms, looking back at fundamental data from 1 year and 2 years ago, and ran into the same issue. I constructed CustomFactors as Dan suggested, for which many thanks, Dan! But I noticed the morningstar data is very unreliable without cleanup. For example, the 'cash flow statement' section is not usable at all; some of the factors there have not been updated in 5 (!) years. For the 'balance sheet' factors, I have sometimes seen a lag of more than one year as well. So far, the only reliable fundamental data I have encountered is in the 'income statement' factors, and only when looking at the Q1500 universe.

So I can confirm the performance discrepancy (which I have also seen when trying to reproduce academic studies based on Compustat) is not solely due to slippage or commission implementations, but rather due to the poor quality of the data...

I agree with Bing that it is a pity that the fundamental data is so unreliable, making fundamental factor models incredibly hard to reproduce with results that match the financial literature. It seems Quantopian's focus is more on technical analysis, as I also found retrieving fundamental data for more than 10 factors over more than a year in one pipeline to be nearly impossible, due to slow backtesting and therefore painfully slow debugging.

That is a pity. Maybe it's possible to use the income statement values to reconstruct or update values from the other financial statements instead of using stale built-in metrics. Probably wouldn't run very fast, though. Also, how reliable are the alternative data sources? Saw another post today about missing earnings calendar data.

Hey Eric,

That is indeed what I did and, as you guessed, it makes the algorithm painfully slow to work with. I am currently looking to step away from Pipeline, as it seems much faster to retrieve historical fundamental data by using the 'get_fundamentals' implementation... All the while I was under the impression Pipeline was the way to go, as I felt Q was advocating for it everywhere, but sadly its memory demands when retrieving > 200 days of history for 10 factors are hard to work with.

Seems like the best fix would be convincing Morningstar to implement the updates. It would reduce load on Quantopian's servers and would make Morningstar's data more useful. It would also help anyone else who's using Morningstar's data.

As for Pipeline, I also noticed that it was a lot slower than get_fundamentals. Still using Pipeline myself though because it allows you to pull up fundamental metrics from previous quarters and I'm not sure if that is possible with get_fundamentals. Saw some old algorithms with get_fundamentals that ran for a year to collect data so they could do a year-over-year comparison.

Hey Jens,

Another option that you might try, that I've been thinking about implementing myself later, is just downloading new fundamental data by stock a day after their most recent as_of dates change for certain items. If you think about it, companies report quarterly, so you really only need to update fundamental data for any company 4 times a year. The other days can be interpolated/forward filled using pandas functions using the last available financials. Another option would be to update fundamental data on weekly or monthly intervals. There's a Quantopian forum post somewhere that includes a function I wrote to update on a weekly basis.
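
The forward-fill part of this idea can be sketched without any platform dependencies. The example below uses only the standard library; in pandas the same effect is usually obtained by reindexing to the daily dates and calling `ffill`. The `forward_fill` helper and the sample report dates and values are hypothetical.

```python
from bisect import bisect_right
from datetime import date

def forward_fill(reports, days):
    """For each day, return the most recent reported value on or before it."""
    report_dates = sorted(reports)
    filled = {}
    for day in days:
        i = bisect_right(report_dates, day)   # number of reports dated <= day
        filled[day] = reports[report_dates[i - 1]] if i > 0 else None
    return filled

# Hypothetical quarterly EPS keyed by the date the value became available.
reports = {date(2017, 2, 1): 1.10, date(2017, 5, 3): 1.25}
days = [
    date(2017, 1, 15),   # before any report: no value yet
    date(2017, 2, 1),
    date(2017, 3, 10),
    date(2017, 5, 3),
    date(2017, 6, 1),
]

filled = forward_fill(reports, days)
print([filled[d] for d in days])  # [None, 1.1, 1.1, 1.25, 1.25]
```

Keying the reports by availability date rather than period end date avoids carrying a value forward from before the algorithm could actually have seen it.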

But all of the above is a moot point given all the missing items I've encountered in Quantopian's Morningstar data.
For instance, all of the following income statement fields have NaN values for every stock in the universe for the entire history of the dataset, which just doesn't match up with SEC filings. I am uncertain how much time I should devote to Quantopian until this gets fixed. Can someone from Quantopian provide a time frame? Are we talking within the next quarter or another year?

IS_ = morningstar.income_statement.
IS_earning_loss_of_equity_investments
IS_gain_losson_derecognitionof_available_for_sale_financial_assets
IS_gain_losson_financial_instruments_designatedas_cash_flow_hedges
IS_impairment_losses_reversals_financial_instruments_net
IS_total_other_finance_cost
IS_gainon_extinguishmentof_debt
IS_gainon_redemptionand_extinguishmentof_debt
IS_losson_extinguishmentof_debt

We have been working on a new backend for the Morningstar fundamental data available on Quantopian. The new backend should solve a lot of the problems that have been brought up on this thread. The new backend should provide corrections on some of the missing or incorrect data and pipeline should be able to load the data more quickly. We're actively working on this project and are making good progress. We'll make an announcement in the community when it's available, which I'm hoping will be relatively soon.

That's exciting to hear, Jamie!

Hi @Jamie,

How can we screen for stocks with "no reported data since ... user-specified date"?
That is, how can we ensure that any stocks whose fundamental data in our particular fields of interest have not been updated "sufficiently recently" get removed from our universe? Do you have any relevant example algo / template / code snippet?
Cheers, best regards, Tony