Dynamic vs Static portfolio selection

Hello everybody!

I am quite new to Quantopian so please excuse me if my question is rudimentary.
I would like to explore investing in "sustainable" companies selected from the S&P 500, and I have created a tool that calculates a sustainability measure (a value between 0 and 1) for each company in the index (canopysustainability.heroku.com).
Using this tool, I can select the top k sustainable companies or filter out companies whose sustainability measure is lower than a certain value (e.g. 0.2).
Now, if I want a portfolio of sustainable companies, would it be better to statically initialize my portfolio (e.g. context.stocks = [sid(8347), sid(5061), …]) based on a selection of the top k sustainable companies, or to use the sustainability measure as a filter in a pipeline?
What I am not sure about is whether Pipeline has replaced the practice of statically initializing portfolios.
I would really appreciate any help.

3 responses

Hi Jenna,

Thanks for posting. You're right that Pipeline is intended to replace static portfolio selection. However, Pipeline only works with data that's on our platform, and we unfortunately don't have the environmental data that you're using to calculate sustainability.

It is possible to use external data on our platform through Fetcher. Fetcher requires multiple rows of data for each stock, each row marked with the date that that data became available. If you have access to the environmental data you used in CSV format with multiple dates and data points for each stock, you could import that data into the backtester and do your calculations there. Or if you could change the CSV on the website to have dates as I described, you could use Fetcher on that. However, if that is too difficult, it seems like it would be fine for you to use a static portfolio in this case.
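
To make the data shape described above concrete, here is a minimal pandas sketch. It is not Quantopian code, and the column names (`date`, `symbol`, `score`), the tickers, and the numbers are all made-up placeholders. The point is only that each stock has several rows, each stamped with the date the value became available, so a backtest can look up what was known on a given day.

```python
import io
import pandas as pd

# Hypothetical CSV: several dated rows per stock (all values are made up).
CSV_TEXT = """date,symbol,score
2016-01-04,AAA,0.41
2016-04-01,AAA,0.45
2016-01-04,BBB,0.12
2016-04-01,BBB,0.18
"""

scores = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["date"])

def score_known_on(df, symbol, day):
    """Return the latest score for `symbol` dated on or before `day`, or None."""
    rows = df[(df["symbol"] == symbol) & (df["date"] <= pd.Timestamp(day))]
    if rows.empty:
        return None
    return float(rows.sort_values("date")["score"].iloc[-1])
```

With a single undated snapshot there is no way to answer "what was known on this day", which is why a static portfolio may be the simpler choice in that case.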

Dear Nathan,

Thank you so much for your reply.

I am afraid I do not have a time-series field in the environmental data, but would it still be possible to use it as a filter in the pipeline?
What I mean is, could I load the environmental data fields with a simple CSV loader and attach them to indices? I would proceed by selecting companies via Pipeline, looking at Quantopian data and making buy/sell decisions. Then I would further influence the decisions by filtering out the companies whose sustainability indices don't meet a user-defined criterion.

Hi Jenna,

I would recommend using fetch_csv with a pre_func argument to store your data as a Pandas Series or DataFrame. (fetch_csv will try to process the CSV file into time-series data but will fail; this doesn't matter though.) This Series or DataFrame should be a context variable so it can be accessed anywhere in your algorithm.
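
As a rough illustration of the kind of function you might pass as pre_func, here is a plain pandas sketch. The function name `to_score_series` and the column names `symbol` and `score` are hypothetical, and the sample rows are made up. How the result is stored on context depends on your algorithm, so that wiring is not shown here.

```python
import pandas as pd

def to_score_series(df):
    """Collapse a one-row-per-company table into a Series keyed by ticker."""
    # Drop rows with a missing ticker or score, and keep the first row per ticker.
    cleaned = df.dropna(subset=["symbol", "score"]).drop_duplicates("symbol")
    return cleaned.set_index("symbol")["score"].astype(float)

# Made-up rows standing in for the downloaded CSV.
raw = pd.DataFrame(
    {"symbol": ["AAA", "BBB", "CCC", None], "score": [0.41, 0.12, 0.77, 0.50]}
)
sustainability = to_score_series(raw)
```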

Unfortunately, using custom data like this in Pipeline is difficult, so I would recommend doing necessary calculations (on price signals and so forth) in Pipeline as normal, without filtering. Then you can filter the output of that pipeline by tossing out all but the symbols that meet your criteria for the environmental data in your Series or DataFrame.
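
The filtering step could look something like the sketch below. It uses a plain DataFrame indexed by ticker strings as a stand-in for the pipeline output; on the platform the index holds asset objects rather than strings, so you would need to map them to tickers first. The names `filter_by_score` and `min_score`, the `momentum` column, and all values are assumptions made for illustration.

```python
import pandas as pd

# Stand-in for a pipeline result: one row per stock, indexed by ticker.
pipeline_result = pd.DataFrame(
    {"momentum": [0.8, 0.3, 0.6, 0.9]},
    index=["AAA", "BBB", "CCC", "DDD"],
)
# Made-up sustainability scores; DDD deliberately has no score.
sustainability = pd.Series({"AAA": 0.41, "BBB": 0.12, "CCC": 0.77})

def filter_by_score(result, scores, min_score):
    """Keep only rows whose ticker has a score of at least `min_score`."""
    aligned = scores.reindex(result.index)  # NaN for tickers without a score
    return result[(aligned >= min_score).to_numpy()]

kept = filter_by_score(pipeline_result, sustainability, min_score=0.2)
```

In this sketch, stocks with no score are dropped along with those below the threshold; whether that is the right default is a design choice for your strategy.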

Let me know if you would like more of an explanation.