Annual Balance Sheet Values

I'm trying to write an algorithm that values companies based on the last few years of a couple of balance sheet items, but I keep running into memory issues.

Does anyone have an efficient way to pull a couple years of annual values for something like total_assets? Ideally, I would be able to pull something like the dataframe at the end of this notebook.

Thanks!

3 responses

This may be what you want.

To get the most current fundamental data, simply use the '.latest' attribute of any field (i.e., a bound column) as shown below.
To get fundamental data as of x trading days ago (e.g., 252 trading days, roughly 1 calendar year), you will need a small custom factor as shown below.

from quantopian.pipeline import CustomFactor  
from quantopian.pipeline.data import morningstar

class Previous(CustomFactor):  
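    # Returns the first (oldest) row of the input window, i.e. the value from
    # roughly window_length trading days ago.
    # Both the inputs and window_length must be specified as there are no defaults

    def compute(self, today, assets, out, inputs):
        # With a single input, `inputs` is a 2-D array of shape (window_length, n_assets)
        out[:] = inputs[0]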

To use the factor, just instantiate it as below. Note that you may want to use the custom factor even for the current data (rather than the '.latest' attribute) because it supports a mask, as shown.

# 'universe' is assumed to be a pipeline Filter defined elsewhere in the notebook
assets_current = morningstar.balance_sheet.total_assets.latest
assets_current_500 = Previous(inputs=[morningstar.balance_sheet.total_assets], window_length=1, mask=universe)
assets_1_year_ago_500 = Previous(inputs=[morningstar.balance_sheet.total_assets], window_length=252, mask=universe)
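To see why `inputs[0]` picks out the older value, here is a minimal sketch in plain Python with no Quantopian imports. It assumes the window handed to `compute` is ordered oldest row first, which is what the factor above relies on. The helper `trailing_window` and the sample numbers are hypothetical stand-ins for what the pipeline engine does, not its actual API.

```python
# Hypothetical daily values of one field for two assets, oldest day first.
history = [
    [100.0, 50.0],  # day 0
    [101.0, 51.0],  # day 1
    [102.0, 52.0],  # day 2
    [103.0, 53.0],  # day 3 (most recent)
]

def trailing_window(rows, window_length):
    """Return the last `window_length` rows, oldest first (assumed stand-in for the engine)."""
    return rows[-window_length:]

def previous(rows, window_length):
    """Mimic Previous.compute: take the first (oldest) row of the window."""
    return trailing_window(rows, window_length)[0]

latest = previous(history, 1)      # one-row window: the most recent values
three_rows = previous(history, 3)  # oldest row of a 3-row window: day 1
print(latest, three_rows)
```

Note that an N-row window reaches back N - 1 rows from the most recent one, so window_length=252 lands roughly, not exactly, one trading year back.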

Attached is your notebook with a few cells added to the end showing this approach in practice. Notice that the data matches the data you had gotten previously.

Thanks for the response, Dan. Unfortunately, I still run into the memory problems :(.

But this got me thinking: is there any way to adjust the as-of date? Something like this:

class Previous(CustomFactor):  
    # Returns value of input x trading days ago where x is the window_length  
    # Both the inputs and window_length must be specified as there are no defaults  
    offset = DateOffset(years=-1)  
    window_length = 1  
    def compute(self, today + offset, assets, out, inputs):  
        out[:] = inputs[0]  

That would limit the data to 1 row, instead of having to pull 252 rows just to drop all but 1.
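For a rough sense of scale, the difference between a 1-row and a 252-row window can be estimated with some back-of-the-envelope arithmetic. This sketch assumes each window is a dense array of 8-byte floats, and the universe size of 8000 assets is a made-up number, not something from this thread. It only sizes a single input window and says nothing about how much data the engine loads internally.

```python
def window_bytes(window_length, n_assets, itemsize=8):
    """Bytes in one dense window of shape (window_length, n_assets)."""
    return window_length * n_assets * itemsize

N_ASSETS = 8000  # hypothetical universe size, for illustration only

one_row = window_bytes(1, N_ASSETS)
one_year = window_bytes(252, N_ASSETS)
print(one_row, one_year, one_year / 1e6)  # bytes, bytes, megabytes
```

Under these assumptions a single 252-row window is on the order of tens of megabytes per input, so the total depends heavily on how many fields and lookbacks are requested at once.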

I'm surprised you're having memory issues, unless you're looking at a lot of factors? I've pulled 10-20 fundamental factors with a mask of the Q1500US without a problem (though it's slow).

You might want to look at the 'downsample' method ( https://www.quantopian.com/help#quantopian_pipeline_filters_Filter ). I've never used it, and I'm not sure if it really reduces memory, but it's worth a shot.

I don't think what you suggested will work. Behind the scenes, I believe the CustomFactor object will be querying all the data anyway. The problem isn't so much what happens inside the compute function as the size of the data that gets returned. Python manages the memory used inside the function and frees it once the function exits.

I did see one thing that seemed odd, however, in the compute function originally posted in your notebook:

class BalanceSheetItems(CustomFactor):  
    window_length = 1  
    def compute(self, today, assets, out, data):  
        out[:] = data

That last line should really reference a single row of data (the most recent day) and not the whole 2-D array. I believe numpy figures out what you mean when window_length is 1, but if you are doing something like that elsewhere with a longer window it may cause problems? Maybe?

Something like this.

out[:] = data[-1]

# or more precisely

out[:] = data[-1,:]
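A small NumPy-only sketch of the point above, outside of any pipeline. The array shapes are assumptions chosen to mirror a `compute` call: `out` has one slot per asset and `data` has one row per day in the window. With a single row, NumPy drops the leading length-1 axis during assignment; with more than one row, the whole-array assignment raises an error, while indexing the last row works in both cases.

```python
import numpy as np

n_assets = 3
out = np.empty(n_assets)

# window_length = 1: data has shape (1, n_assets)
data = np.array([[10.0, 20.0, 30.0]])
out[:] = data        # works: the leading length-1 axis is dropped
implicit = out.copy()
out[:] = data[-1]    # explicit: last row, shape (n_assets,)
explicit = out.copy()

# window_length = 2: assigning the whole array no longer fits
data2 = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])
try:
    out[:] = data2
    failed = False
except ValueError:
    failed = True

out[:] = data2[-1]   # the last (most recent) row still works
print(implicit, explicit, failed, out)
```

Indexing the row explicitly keeps the factor correct if the window_length is later changed.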