Change in Fundamental Data in Research

I'm trying to measure the change in certain fundamental data values over time, for instance the change in a company's overall cash position. I would like to compare the most recent cash position to the previous one and get the difference.

How would I go about something like this? I'm mostly having trouble figuring out how to access the previous cash position.


If anyone knows the answer, I'm still looking for a response.

This is an excellent question. Changes in fundamental values are often associated with changes in overall company value and could predict a change in stock price. Developing a factor for changes in fundamental data could potentially yield some alpha.

However, determining a change in fundamentals is trickier than it would seem. The basic problem is how the fundamental data is updated and presented to the algorithm. When new fundamental data becomes available, it is backfilled all the way to its 'as-of' date. As an example, say XON's year-end cash as of 12-31 changes. It isn't reported until perhaps 3-1, and then isn't actually loaded into the Morningstar database until a few days later, on 3-5. So up through 3-5 the data was really from the previous quarter. But, voila, on 3-6 all the data between 1-1 and 3-5 gets magically updated to the 12-31 values. A custom factor run on 3-6 never sees a change (it's all the 12-31 data).
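
To make that concrete, here is a toy sketch of what a naive day-over-day factor sees around the reload (the dates and numbers are made up):

import numpy as np

# Hypothetical cash values: 41.0 is the old quarter's figure, 38.5 the new one.
# Window as seen on 3-5, before the reload: all old data.
seen_on_3_5 = np.full(5, 41.0)
# Window as seen on 3-6, after the reload: history rewritten back to 12-31.
seen_on_3_6 = np.full(5, 38.5)

# Comparing 'today' to 'yesterday' inside either window shows no change,
# so the reload never registers as a day-over-day event.
print(seen_on_3_5[-1] - seen_on_3_5[-2])  # 0.0
print(seen_on_3_6[-1] - seen_on_3_6[-2])  # 0.0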

One solution is to look back over a period of time, say 90 days, and check for any changes in the data. One way to do this is with the numpy 'unique' method. Something like this [NOTE: the code below has been updated/corrected from the original post. The attached notebook reflects the original, wrong code; an updated/corrected notebook is attached in a subsequent post]:


import numpy as np
from quantopian.pipeline import CustomFactor

class DataChange(CustomFactor):
    # Pass the fundamental column as `inputs` when instantiating, e.g.
    # DataChange(inputs=[Fundamentals.cash_and_cash_equivalents])
    window_length = 90
    outputs = ["pct_change", "days_since_change"]

    def compute(self, today, assets, out, data):

        def column_change(column):
            # numpy 'unique' returns the unique values and the indexes
            # of the first occurrences of those values
            value_at_changes, index_at_changes = np.unique(column, return_index=True)

            # Note that 'unique' sorts its results from low to high value.
            # We really want them in the original time-sequenced order,
            # so sort the indexes and fetch the associated values from the original data.
            index_at_changes = sorted(index_at_changes)
            value_at_changes = column[index_at_changes]

            if len(index_at_changes) == 1:
                # 'unique' found only a single value - no data change within the window
                change = 0.0
                days_since_change = np.nan
            else:
                # There was a change. We just want the last one, so compare [-1] and [-2].
                change = (value_at_changes[-1] / value_at_changes[-2]) - 1.0
                days_since_change = self.window_length - index_at_changes[-1]

            return change, days_since_change

        # Apply the above function across each column and output the values
        out.pct_change[:], out.days_since_change[:] = np.apply_along_axis(column_change, 0, data)


This will return the percent change of a fundamental along with how many days ago that change happened. It may be wise to check the days value to make sure the change is from the previous quarter (i.e., less than 65 or so). There may be a more elegant way to accomplish this.
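
For what it's worth, here is a rough sketch of how one might wire this factor into a pipeline in Research. The Fundamentals.cash_and_cash_equivalents column, the dates, and the 65-day screen are just assumptions for illustration:

from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import Fundamentals
from quantopian.research import run_pipeline

# Track changes in overall cash position
cash_change = DataChange(inputs=[Fundamentals.cash_and_cash_equivalents])

pipe = Pipeline(
    columns={
        'cash_pct_change': cash_change.pct_change,
        'days_since_change': cash_change.days_since_change,
    },
    # Keep only changes recent enough to be from the latest quarter
    screen=(cash_change.days_since_change < 65),
)

result = run_pipeline(pipe, '2016-01-04', '2016-01-04')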

See attached notebook. At least something to start with. Good luck. And again, good question.

Great answer. I've been using a version of this in many different algos of mine. Here's a code snippet that I've been using fairly frequently:

import numpy as np
from quantopian.pipeline import CustomFactor

class AnnualAvg(CustomFactor):
    window_length = 255

    def compute(self, today, assets, out, data):

        def column_average(column):
            newvals, index_at_changes = np.unique(column, return_index=True)
            if len(index_at_changes) < 4:
                # Only 3 unique values or fewer - the data didn't update often enough
                avg = np.nan
            else:
                # At least 4 unique values, so we have 4 quarters of unique data. Return the average.
                avg = np.average(newvals)
            return avg

        # Apply the above function across each column and output the values
        out[:] = np.apply_along_axis(column_average, 0, data)

How could I modify this so that instead of last year's average, I get the average of the year before, while still verifying that the data actually updated often enough? Simply indexing newvals (e.g. newvals[-5], newvals[-6]) doesn't work; the values come back sorted by value rather than in chronological order.

Updated notebook with corrected code (added a sort to the returned 'unique' data).
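
For anyone who hit the same ordering issue as the question above: the fix is the same time-order sort used in DataChange. Re-sort the indexes that 'unique' returns, then pull the values back out of the column chronologically. A rough sketch of a prior-year average built that way (the PriorYearAvg name, the 510-day window, and the [-8:-4] quarter slice are assumptions, not the notebook's code):

import numpy as np
from quantopian.pipeline import CustomFactor

class PriorYearAvg(CustomFactor):
    # Roughly 2 years of daily rows, enough to hold about 8 quarterly values
    window_length = 510

    def compute(self, today, assets, out, data):

        def column_average(column):
            _, index_at_changes = np.unique(column, return_index=True)
            # 'unique' sorts by value; re-sort the first-occurrence indexes
            # chronologically and fetch the values in time order.
            index_at_changes = sorted(index_at_changes)
            vals_in_time_order = column[index_at_changes]
            if len(vals_in_time_order) < 8:
                # Fewer than 8 distinct values - not enough quarterly updates
                return np.nan
            # Quarters 5 through 8 back, i.e. the year before the most recent year
            return np.average(vals_in_time_order[-8:-4])

        out[:] = np.apply_along_axis(column_average, 0, data)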