New Dataset
Today, we added a new dataset to the platform: FactSet Fundamentals. The FactSet Fundamentals dataset provides access to more than 800 corporate fundamental data fields. The full list of available fields can be found on this reference page. FactSet Fundamentals data is available in Research and Backtesting via the Pipeline API. You can access FactSet Fundamentals data in Pipeline like this:
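# Import the FactSet datasets that are exposed to Pipeline.
from quantopian.pipeline.data import factset
# Most recently known quarterly sales value for each asset.
quarterly_sales = factset.Fundamentals.sales_qf.latest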
Usable in the Contest
Since FactSet Fundamentals are available in backtesting, you can use the dataset in the contest. As requested by a community member in another thread, we are increasing the limit on the number of contest entries per person to 4 so that you can make a new entry with FactSet Fundamentals without having to pull one of your existing entries. Algorithms that use FactSet Fundamentals are eligible to be considered for an allocation.
Attached is an example notebook that uses FactSet Fundamentals in a pipeline to help get you started.
New Features
There are some new features that are unique to FactSet Fundamentals (these features do not apply to the Morningstar Fundamentals integration):
Reporting Lag Simulated Based On Fiscal Quarter
Most fundamentals data comes from public reports that companies are required to publish either quarterly or annually. Companies file these reports after the close of each quarter/year, but the exact amount of time between the period end and the filing is different from company to company and even from period to period. In the US, for example, companies have 45 days to file their quarterly reports for Q1, Q2, and Q3, but they have 60 days to file for Q4.
Assuming that quarterly reports were available immediately after the close of a company's fiscal period is an easy way to introduce lookahead bias into a model. To prevent this form of lookahead bias, Quantopian timestamps each data point as it is downloaded from the vendor on a nightly basis. We use the timestamps to inform Pipeline when each data point can be introduced into the simulation.
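To make the idea concrete, here is a minimal sketch of a timestamp-based "as of" lookup. It is not Pipeline's actual implementation: the function name `latest_known_value`, the `(timestamp, value)` record layout, and the rule that a data point becomes visible on its timestamp date are all assumptions made for illustration.

```python
from datetime import date


def latest_known_value(records, simulation_date):
    """Return the value with the latest timestamp on or before simulation_date.

    `records` is an iterable of (timestamp, value) pairs. This layout is a
    hypothetical stand-in for timestamped fundamentals data points.
    Returns None when nothing was known yet on simulation_date.
    """
    # Keep only data points whose timestamp is not in the simulated future.
    # Assumption: a point is visible on the same date as its timestamp.
    visible = [(ts, value) for ts, value in records if ts <= simulation_date]
    if not visible:
        return None
    # Among the visible points, the most recently timestamped one wins.
    return max(visible, key=lambda pair: pair[0])[1]


# Made-up values, for illustration only.
example_records = [
    (date(2018, 2, 14), 100.0),
    (date(2018, 5, 10), 110.0),
]
```

With these made-up records, a simulation date before February 14th, 2018 sees no value at all, and a date between the two timestamps sees only the first value, even though the second value describes a period that had already ended.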
Of course, the approach of timestamping data as it is downloaded doesn’t work with historical data that existed before Quantopian started collecting it. We model timestamps of historical data points by using the report dates provided by the vendor. When a report date isn’t provided, we lag the fiscal period end date. In our Morningstar integration, we lag the quarter/year end date by 45 days - the maximum allowed time to file the report for companies in the US in Q1-Q3. For our new FactSet integration, records without a file date are lagged by 45 days for Q1-Q3, and by 60 days for Q4, which is the maximum allowed time to file the year-end report for companies in the US.
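The FactSet lag rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this post, not the production code: the function name `simulated_available_date` and its parameters are hypothetical, and the 45/60 day values are the US numbers given above.

```python
from datetime import date, timedelta

# Lag lengths stated in this post for US companies.
Q1_TO_Q3_LAG_DAYS = 45
Q4_LAG_DAYS = 60


def simulated_available_date(period_end, fiscal_quarter, report_date=None):
    """Estimate when a historical data point becomes visible in a simulation.

    Hypothetical helper: uses the vendor's report date when one exists,
    otherwise lags the fiscal period end date by 45 days (Q1-Q3) or
    60 days (Q4).
    """
    if report_date is not None:
        # A vendor-provided report date takes precedence over the lag.
        return report_date
    if fiscal_quarter not in (1, 2, 3, 4):
        raise ValueError("fiscal_quarter must be 1, 2, 3, or 4")
    lag_days = Q4_LAG_DAYS if fiscal_quarter == 4 else Q1_TO_Q3_LAG_DAYS
    return period_end + timedelta(days=lag_days)
```

For example, a Q1 record with a period end of March 31st, 2010 and no report date would become available on May 15th, 2010, while a Q4 record ending December 31st, 2010 would become available on March 1st, 2011.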
The new lag approach will be applied to FactSet Fundamentals in other markets as well. As we roll out new markets, we will be adding fundamentals data with a similar lag algorithm. The number of days to lag will vary based on the reporting laws in each country.
Data Holdout Period
Quantopian Community access to FactSet Fundamentals data is subject to a trailing 1-year holdout period. For example, because today is October 11th, 2018, you can access FactSet Fundamentals data through October 11th, 2017. Unlike our subscription-based datasets, the free version of the FactSet data is usable in contest algorithms without a subscription, and Quantopian will evaluate algorithms that use FactSet Fundamentals with up-to-date data. Submitting an algorithm that uses FactSet data to the contest follows the usual submission process. The only difference is that your backtest prior to submission can’t run through the holdout period.
We are hopeful that the holdout will help to reduce the risk of overfitting. Since Quantopian will be using up-to-date data to evaluate and score contest entries, this is a good opportunity to build factors on in-sample data and test them in the contest using out-of-sample data.
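If you want to work out the last date your own backtests can reach, the trailing holdout described above can be approximated like this. This is a rough sketch, not how the platform enforces the holdout: the helper names `holdout_cutoff` and `clamp_backtest_end` are hypothetical, and both the inclusive boundary and the February 29th handling are assumptions.

```python
from datetime import date


def holdout_cutoff(today, holdout_years=1):
    """Return the last date of data assumed to be accessible on `today`."""
    try:
        return today.replace(year=today.year - holdout_years)
    except ValueError:
        # February 29th has no counterpart in a non-leap year.
        # Assumption: fall back to February 28th.
        return today.replace(year=today.year - holdout_years, day=28)


def clamp_backtest_end(requested_end, today):
    """Pull a requested backtest end date back so it stays out of the holdout."""
    return min(requested_end, holdout_cutoff(today))
```

Using the dates from this post, `holdout_cutoff(date(2018, 10, 11))` gives October 11th, 2017, and a backtest requested to end on June 1st, 2018 would be clamped to that same date.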
Coming Soon: Global Coverage
FactSet Fundamentals data has global coverage. As we expand the Quantopian platform to support global equity research, we will be adding FactSet Fundamentals data for each new market. Our global expansion is starting in Research. Backtesting will not support global equities right away. We will be making another announcement soon with details about the global integration.
Other Notes
- This data is now usable in the contest. We encourage you to play around with it and see if you can incorporate it into your research! And as always, please let us know if you discover any issues or if you have any questions.
- Pipeline allows you to access FactSet Fundamentals data back to 2004.
- The update frequency of each field is denoted by a suffix. For example, fields ending in _qf are updated at a quarterly frequency, fields ending in _af are updated at an annual frequency, and fields ending in _saf are updated at a semi-annual frequency. See the FactSet Fundamentals reference for more information.
- Currently, our FactSet Fundamentals integration doesn’t include their Last Twelve Month (LTM) data, nor any per-share fields. We plan to add both of these types of fields at a later date.
- We expect to apply a minor update to this dataset in the next couple of weeks. Specifically, some data points in September/October 2018 are surfaced a day late. When the update is applied, some of your simulation results might change, but only for the September/October 2018 period. Since this data is in the holdout period, you would only notice this if you use FactSet Fundamentals data in a contest algorithm.
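As a small aid for the field suffix convention mentioned in the notes above, here is a sketch of a helper that maps a field name to its update frequency. It is plain Python and not part of the Quantopian API: the name `update_frequency` is hypothetical, and it only knows the three suffixes listed in this post.

```python
# Suffixes listed in this post. Other suffixes may exist in the
# FactSet Fundamentals reference, and this sketch does not cover them.
SUFFIX_TO_FREQUENCY = {
    "_qf": "quarterly",
    "_af": "annual",
    "_saf": "semi-annual",
}


def update_frequency(field_name):
    """Return the update frequency implied by a field name's suffix.

    Returns None when the suffix is not one of the three listed above.
    """
    for suffix, frequency in SUFFIX_TO_FREQUENCY.items():
        if field_name.endswith(suffix):
            return frequency
    return None
```

For the field used at the top of this post, `update_frequency("sales_qf")` returns `"quarterly"`.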
Happy coding!