How are we supposed to upload our own Classifiers?

I've been attempting to create my own Classifier from a custom dataset on which to use some of the groupby variables. Unfortunately, it's not clear how to take a Factor (the only thing I can produce from custom datasets because there's only a "number" option, not an "int") and convert it into a Classifier. It would be extremely useful for there to be an as_classifier() method on Factors or the ability to recast Factors as Classifiers. I've tried setting the BoundColumn to an np.int64 type and then using .latest, but that fails some assertion and says the type isn't recognized. Is this a bug? How do I create classifiers from custom datasets?

1 response

Great question: how does one import classifiers using self-serve data? First, why would one want to import a classifier? There are many reasons, but take a simple example. One has a dataset of analyst recommendations whose values are either 'buy', 'hold', or 'sell' (or, potentially, just the numeric values 1, 2, and 3 respectively). The nice thing about classifiers is that they can be used for grouping (as Robert alluded to above). So, as an example, one could create a filter for the 5 lowest-priced stocks in each rating.

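        from quantopian.pipeline.data.builtin import USEquityPricing

        # Assume my_self_serve is the imported dataset name with a string field called 'rating'
        price = USEquityPricing.close.latest
        rating = my_self_serve.rating.latest

        # Filter that is True for the 5 lowest-priced stocks within each rating group
        lowest_priced_in_each_rating = price.bottom(5, groupby=rating)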

What makes this work is setting the column type to 'string'. When setting up the self-serve data feed, select 'string' for any columns one wants to use as classifiers. Then, when fetching the latest property of such a column (e.g. my_self_serve.rating.latest), one gets a classifier rather than a factor.
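To make the grouping concrete, here is a minimal sketch in plain Python (not Pipeline code) of what a "bottom N within each group" selection computes. The function name bottom_n_by_group and the tickers and prices are hypothetical, and the tie-breaking here (by asset name) is an assumption that may differ from how Pipeline ranks ties.

```python
from collections import defaultdict

def bottom_n_by_group(prices, groups, n):
    """Return the assets that are among the n lowest-priced within their group."""
    by_group = defaultdict(list)
    for asset, price in prices.items():
        by_group[groups[asset]].append((price, asset))

    selected = set()
    for members in by_group.values():
        members.sort()  # ascending by price, then by asset name on ties
        selected.update(asset for _, asset in members[:n])
    return selected

# Hypothetical data: one price and one rating label per asset.
prices = {"AAA": 10.0, "BBB": 25.0, "CCC": 7.5, "DDD": 40.0, "EEE": 3.0}
ratings = {"AAA": "buy", "BBB": "buy", "CCC": "hold", "DDD": "hold", "EEE": "sell"}

lowest = bottom_n_by_group(prices, ratings, n=1)
print(sorted(lowest))
```

The rating values only serve as group labels here, which is why a classifier (discrete labels) is needed for groupby rather than a factor (continuous numbers).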

That's the straightforward way to make a classifier from self-serve data. Set the column type to 'string'. Stop reading here if that's all you care about.

So, since you are still reading, you may want to know a bit more....

Below is the code for the latest property of a BoundColumn. It can be found on GitHub here (https://github.com/quantopian/zipline/blob/master/zipline/pipeline/data/dataset.py)

    @property
    def latest(self):  
        dtype = self.dtype  
        if dtype in Filter.ALLOWED_DTYPES:  
            Latest = LatestFilter  
        elif dtype in Classifier.ALLOWED_DTYPES:  
            Latest = LatestClassifier  
        else:  
            assert dtype in Factor.ALLOWED_DTYPES, "Unknown dtype %s." % dtype  
            Latest = LatestFactor

        return Latest(  
            inputs=(self,),  
            dtype=dtype,  
            missing_value=self.missing_value,  
            ndim=self.ndim,  
        )

Without getting into the specifics of the code, notice that it's creating a filter, classifier, or factor based upon the BoundColumn datatype. This is 'the implicit' way to create a classifier. It's 'implicit' because the type of object being created is implied by the type of data in the BoundColumn.
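The dispatch above can be imitated with a small stand-alone sketch. The dtype sets below are illustrative stand-ins written as strings, not zipline's actual ALLOWED_DTYPES values, and the function latest_term_kind is hypothetical. The assumption being illustrated is that a 'string' column ends up with an object-like dtype (so it lands in the classifier branch) while a 'number' column ends up as a float (so it lands in the factor branch).

```python
# Illustrative stand-ins only; the real sets live in zipline and may differ.
FILTER_DTYPES = {"bool"}
CLASSIFIER_DTYPES = {"int64", "object"}
FACTOR_DTYPES = {"float64", "datetime64[ns]"}

def latest_term_kind(dtype_name):
    """Mimic the branch order of BoundColumn.latest: filter, classifier, then factor."""
    if dtype_name in FILTER_DTYPES:
        return "filter"
    if dtype_name in CLASSIFIER_DTYPES:
        return "classifier"
    if dtype_name not in FACTOR_DTYPES:
        raise ValueError("Unknown dtype %s." % dtype_name)
    return "factor"

for name in ("object", "float64", "bool"):
    print(name, "->", latest_term_kind(name))
```

Read this way, the column type chosen at upload time decides which branch is taken, which is why no explicit conversion step is needed in the 'string' case.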

Now, one could also create a classifier 'explicitly' using the CustomClassifier class. One may want to do this if the BoundColumn data type is 'number', or to do some other data manipulation along the way.

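from quantopian.pipeline import CustomClassifier
import numpy as np

# Assume my_self_serve is the imported dataset with a 'number' field called 'rating'
class Rating_Classifier(CustomClassifier):
    inputs = [my_self_serve.rating]
    window_length = 1
    dtype = np.int64        # integer labels so the output can be used for grouping
    missing_value = 9999    # label to use when the rating is missing

    def compute(self, today, assets, out, rating):
        # Copy the ratings into the output as integer labels
        out[:] = rating

rating = Rating_Classifier()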

The key to setting up a CustomClassifier is to specify the dtype as np.int64. Also set missing_value to some number that will stand in for the rating when it is missing; pick a value that can never be a real rating. One can then set the column type in the dataset setup to 'number' and use those numbers as a classifier.
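Why a sentinel is needed at all: integers have no NaN, so a missing rating has to be represented by some reserved number. The sketch below is plain Python, not Pipeline code. The function to_labels is hypothetical, and it assumes missing ratings show up as NaN (or None) in the 'number' column, which is worth checking against the real data.

```python
import math

MISSING = 9999  # same sentinel as missing_value in the post's example

def to_labels(values, missing_value=MISSING):
    """Convert float ratings to integer labels, mapping missing entries to a sentinel."""
    labels = []
    for v in values:
        if v is None or math.isnan(v):
            labels.append(missing_value)
        else:
            labels.append(int(v))
    return labels

labels = to_labels([1.0, 3.0, float("nan"), 2.0])
print(labels)
```

If missing ratings do arrive as NaN, a compute method may need a similar substitution before assigning to out; that is an assumption, not something the original example does.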

Again, great question. Good luck.
