CustomFactor (ValueError: setting an array element with a sequence.)

Hi,
I am trying to build a custom factor with morningstar.asset_classification.financial_health_grade, but I keep getting the following error: ValueError: setting an array element with a sequence. It seems the error happens somewhere at out[:]. Can someone help me figure out what I am doing wrong?

Thank you

from quantopian.pipeline import Pipeline, CustomFactor  
from quantopian.research import run_pipeline  
from quantopian.pipeline.data.builtin import USEquityPricing  
from quantopian.pipeline.data import morningstar  
import pandas as pd  
import numpy as np

class test_class(CustomFactor):  
    inputs = [morningstar.asset_classification.financial_health_grade]  
    window_length = 1  
    def compute(self, today, assets, out, test):  
        out[:]=test  


def make_pipeline():  
    price= USEquityPricing.close.latest  
    class_test = test_class()  
    morning_test =morningstar.asset_classification.financial_health_grade.latest  
    return Pipeline(columns={'moring test': morning_test,  
                             'close_price':price,  
                             'class test': class_test})

my_pipe=make_pipeline()  
results=run_pipeline(my_pipe, '2015-01-01', '2015-01-01')  
results.head ()  
4 responses

You are spot on finding the error to be with the 'out[:]' assignment. The error 'setting an array element with a sequence' specifies the problem pretty well.

The 'out' object is a one-dimensional numpy array. Each element needs to be assigned a single value by your 'compute' function code. That value is your custom factor's value for a specific asset. The assets are passed in the 'assets' array. So, as an example, out[99] should be assigned the factor value for assets[99], where assets[99] is the SID of the asset (not the equity object).

Anyway, 'test' is a two-dimensional numpy array. Each row is a date. Each column is an asset. The values are those of the input you supplied.

Your assignment statement 'out[:]=test' is trying to set a one-dimensional array from a two-dimensional array. Hence the error 'setting an array element with a sequence'. Python is trying to set each array element of 'out' but you are giving it an array.

So, simply be more explicit about what you want to do. The following will set 'out' to the latest value of 'test'. Remember that the rows of 'test' are the dates, so the last row is the last date. Indexing 'test' with [-1] returns a one-dimensional array which (not coincidentally) is exactly the same length as 'out'. Now Python knows what to do and assigns the single values from the last row of 'test' to each element of 'out'.

    def compute(self, today, assets, out, test):  
        out[:]=test[-1]  

Depending upon what you want your factor to do, the code will change. You can do any kind of crazy calculation inside the 'compute' method. Just make sure you are ultimately setting 'out' with a one-dimensional array whose length equals the number of assets (i.e. len(assets)).
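
To make the shapes concrete, here is a minimal NumPy sketch that uses stand-in arrays rather than real pipeline data. The sizes (window_length, n_assets) and the values are made up for illustration; only the shapes mirror what 'compute' receives.

```python
import numpy as np

# Stand-ins for what compute() receives (hypothetical sizes and values).
window_length, n_assets = 3, 4
test = np.arange(window_length * n_assets, dtype=float).reshape(window_length, n_assets)
out = np.empty(n_assets)          # one slot per asset

# Rows are dates, columns are assets; the last row is the most recent date.
out[:] = test[-1]                 # shape (4,) assigned into shape (4,)
latest = out.copy()

# Any reduction over the date axis also yields one value per asset.
out[:] = test.mean(axis=0)
window_mean = out.copy()

print(test.shape, latest.shape, window_mean.shape)
```

Either form leaves 'out' holding exactly one number per asset, which is all the pipeline expects from 'compute'.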

Good Morning Dan,
Thank you very much for explaining in detail how CustomFactor works. I have tried your solution but unfortunately I get the same error: ValueError: setting an array element with a sequence. Please see my code below.


class test_class(CustomFactor):  
    inputs = [morningstar.asset_classification.financial_health_grade]  
    window_length = 1  
    def compute(self, today, assets, out, test):  
        out[:]=test[-1]  


def make_pipeline():  
    price= USEquityPricing.close.latest  
    class_test = test_class()  
    mornin_latest_test=morningstar.asset_classification.financial_health_grade.latest  
    return Pipeline(columns={'class_test': class_test,  
                             'close_price':price,  
                             'morning_latest_test': mornin_latest_test})

my_pipe=make_pipeline()  
results=run_pipeline(my_pipe, '2015-01-01', '2015-01-01')  
results.head ()

It seems to me this problem only occurs if the array contains string objects. I am also not sure it is a plain array: I did a small test to see what type of data is passed to compute, and I get the following:

class 'zipline.lib.labelarray.LabelArray'

class test_class(CustomFactor):  
    inputs = [morningstar.asset_classification.growth_score]  
    window_length = 1  
    def compute(self, today, assets, out, test):  
        print type(test)  
        out[:]=test  

And just as a sanity check, if we take any numerical input from morningstar, for example morningstar.asset_classification.growth_score, we get type 'numpy.ndarray' as you pointed out, which has no problem being assigned to out[:].

Ahhh. The Morningstar data 'asset_classification.financial_health_grade' is a string (i.e. A, B, C, etc.). Factors must return either a numerical or a date-valued output (https://www.quantopian.com/help#quantopian_pipeline_factors_Factor).

The 'sequence' that the error is referring to is the string, which it doesn't like. It wants a single scalar value.
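
A similar failure can be seen outside the pipeline with plain NumPy. This is only a sketch: it uses an ordinary object array of made-up grades in place of the real LabelArray, so the exact error text may differ from the one in the notebook, but the cause is the same: a float output array cannot hold string labels.

```python
import numpy as np

# Made-up stand-in for the string-valued input (not a real LabelArray).
grades = np.array(['A', 'B', 'C'], dtype=object)

# A float array, like the 'out' a factor is given.
out = np.full(len(grades), np.nan)

try:
    out[:] = grades               # strings cannot be stored as floats
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)
```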

So, a couple of options... if you just want the latest 'asset_classification.financial_health_grade' there's no need to make a custom factor. Just use the '.latest' attribute.

health = morningstar.asset_classification.financial_health_grade.latest

You could also do some logic to transform the ABC grades into numbers inside the compute function of the CustomFactor.
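
As a rough illustration of that transformation, the sketch below maps grade strings to numbers with plain NumPy. The grade scale and the scores in GRADE_TO_SCORE, the helper name grades_to_scores, and the sample grades are all assumptions made for the example, not something taken from the Morningstar data.

```python
import numpy as np

# Hypothetical mapping; the actual set of grades and the scores are assumptions.
GRADE_TO_SCORE = {'A': 4.0, 'B': 3.0, 'C': 2.0, 'D': 1.0, 'F': 0.0}

def grades_to_scores(grades):
    """Map a 1-D sequence of grade strings to floats.

    Unknown or missing grades become NaN.
    """
    return np.array([GRADE_TO_SCORE.get(g, np.nan) for g in grades], dtype=float)

# Made-up latest grades for four assets, one of them missing.
latest_grades = np.array(['A', 'C', None, 'B'], dtype=object)

out = np.empty(len(latest_grades))
out[:] = grades_to_scores(latest_grades)   # numeric values fit a float 'out'
print(out)
```

Inside 'compute' the same idea would be applied to the last row of the input once it has been converted to plain strings; how best to do that conversion for a LabelArray is not covered here.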

As a last resort you could create a CustomClassifier instead of a CustomFactor. Classifiers can return strings. They are trickier than CustomFactors, though.

Got it! Thank you very much=)