sid error

posted Sep 15, 2012

I attempted to modify the sample algorithm provided by Quantopian (Multiple Security Example), by replacing the fixed stocks with random ones:

  s1 = random.randint(1, 5061)  
  s2 = random.randint(1, 5061)  
  s3 = random.randint(1, 5061)  
  s4 = random.randint(1, 5061)  
  context.stocks = [sid(s1), sid(s2), sid(s3), sid(s4)]

When I build the code, I get a "The sid(id) method takes one parameter" error. I get the same error with this code:

  s1 = 24  
  s2 = 2  
  s3 = 2673  
  s4 = 5061  
  context.stocks = [sid(s1), sid(s2), sid(s3), sid(s4)]

Any idea why I am getting the error?

7 responses

John Fawcett

Sep 15, 2012

@Grant, thanks for posting this question. Looks like you were trying to work around our need for statistical sampling of sids, but you ran into a related limitation.

To manage the data queries necessary for the simulation, we parse the code before execution and find all calls to the sid() function. That way we can configure the simulation to run with just the data you need. As a result, sid() accepts only integer literals as its argument: sid(24) works, but sid(s1), as above, will not.
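A minimal sketch of the kind of pre-execution scan described above, using Python's standard `ast` module. This is an illustration only, not Quantopian's actual parser, and the function name `find_sid_calls` is hypothetical. It collects the integer literals passed to `sid()` and flags calls, such as `sid(s1)`, whose argument cannot be resolved without running the code.

```python
import ast  # ast.Constant requires Python 3.8 or later


def find_sid_calls(source):
    """Statically scan source for sid(...) calls.

    Returns (literal_ids, rejected_lines): the sorted integer literals passed
    to sid(), and the sorted line numbers of sid() calls whose argument is
    anything other than a single integer literal.
    Hypothetical helper, not part of any Quantopian API.
    """
    literal_ids = []
    rejected_lines = []
    for node in ast.walk(ast.parse(source)):
        is_sid_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "sid"
        )
        if not is_sid_call:
            continue
        args = node.args
        if (
            len(args) == 1
            and not node.keywords
            and isinstance(args[0], ast.Constant)
            and type(args[0].value) is int  # excludes bools, floats, strings
        ):
            literal_ids.append(args[0].value)
        else:
            # e.g. sid(s1): the value of s1 is unknown until the code runs
            rejected_lines.append(node.lineno)
    return sorted(literal_ids), sorted(rejected_lines)


algo = "s1 = 24\nstocks = [sid(24), sid(s1), sid(5061)]\n"
ids, rejected = find_sid_calls(algo)
print(ids, rejected)
```

Because the scan never executes the algorithm, a variable argument gives it nothing to look up, which is consistent with the build error reported in the original post.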

I think this approach of specifying specific instruments is appropriate for technical analysis and other stock specific techniques. Most of our users are clamoring for support for stat-arb style algo development, where the algorithm is also responsible for choosing the universe of securities. We are working on that now, and our plan is to provide a simple magic function in the initialize method of your algorithm:


where **universe** is a dict keyed by either a date string in the form YYYYMMDD or a datetime (we'll force the time to midnight of the given date). The value of each entry will be a list of ints or sids. Before each market open, the universe will be reset to match the list of sids in the dictionary entry with the latest date before the open.

You could then do the random sampling you have above, against the list of sids you supplied.
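One way the date-keyed lookup and the sampling could fit together, sketched in plain Python. The planned function itself is not shown in this thread, so every name here (`universe_for`, `sample_universe`) is hypothetical. Treating a key dated on the market day itself as "before the open" (because its time is forced to midnight) is also an assumption.

```python
import random
from datetime import date, datetime


def _as_date(key):
    """Normalize a YYYYMMDD string, datetime, or date to a date (time dropped)."""
    if isinstance(key, datetime):  # check datetime first: it subclasses date
        return key.date()
    if isinstance(key, date):
        return key
    return datetime.strptime(key, "%Y%m%d").date()


def universe_for(universe, market_day):
    """Return the sid list whose key date is the latest one on or before market_day.

    Returns an empty list when no entry applies yet.
    """
    dated = {_as_date(k): v for k, v in universe.items()}
    eligible = [d for d in dated if d <= market_day]
    return list(dated[max(eligible)]) if eligible else []


def sample_universe(universe, market_day, k, seed=None):
    """Randomly pick up to k ids from the universe in force on market_day."""
    rng = random.Random(seed)
    sids = universe_for(universe, market_day)
    return rng.sample(sids, min(k, len(sids)))


# The ids below are the ones from the original post.
universe = {
    "20120103": [24, 2],
    datetime(2012, 2, 1): [24, 2, 2673, 5061],
}
print(universe_for(universe, date(2012, 1, 15)))
print(sample_universe(universe, date(2012, 3, 1), k=2, seed=0))
```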

We need to do some testing to figure out how many sids we can allow per market day. We may need to start with something very conservative, like 10 or 25. Seems like 500 is a reasonable goal, based on others' feedback. What do you think?

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Grant Kiehne

Sep 16, 2012

@Fawce, I'd encourage you to keep thinking about how not to limit the number of SIDs available to a given algorithm. For a broad study/screen of securities, it seems like one would want to sample from the entire universe, rather than place an arbitrary limit. If I understand your reply above, a user would be able to sample from a limited universe of 500 SIDs. If the user were interested in an unbiased sampling of the entire Quantopian universe of SIDs, first he'd need to randomly pick 500 SIDs from that universe to be used in the backtest algorithm. Then, the subset could be sampled via the Python code running the algorithm. This approach would be doable, but kinda awkward, since the subset of 500 randomly selected SIDs would need to be generated by the user outside of your application. Over time, the subset would become biased (e.g. by IPOs, de-listings, etc.), so it would need to be periodically refreshed manually by the user.
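The two-stage sampling described above might look like the following sketch. It assumes, as in the original post, that ids run from 1 to 5061 and that every id in that range is valid, which may not hold. The 500-SID cap is the figure floated in this thread, not a confirmed limit, and `two_stage_sample` is a hypothetical name.

```python
import random


def two_stage_sample(max_sid=5061, universe_size=500, picks=4, seed=None):
    """Draw a fixed universe of ids, then draw the traded stocks from it."""
    rng = random.Random(seed)
    # Stage 1 (done by the user, outside the platform): pick the fixed universe.
    universe = rng.sample(range(1, max_sid + 1), universe_size)
    # Stage 2 (inside the algorithm): pick the stocks to trade from that universe.
    stocks = rng.sample(universe, picks)
    return universe, stocks


universe, stocks = two_stage_sample(seed=1)
print(len(universe), stocks)
```

Both stages sample uniformly without replacement, so the final picks are an unbiased draw from the full id range at the moment stage 1 runs. The bias from later IPOs and de-listings comes from not re-running stage 1.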

There seem to be some underlying computer hardware constraints at play here. Are you thinking in terms of what would run on your present system (I'm guessing that you have a single server running Quantopian)? I'd encourage you to consider what you could do with parallel/cluster/distributed/cloud/GPU computing. Perhaps industry folks can comment, but my assumption is that for screening, hedge fund types (and academics) are submitting batch jobs that run in the background on commodity "supercomputers" for hours/days. So they'll have an advantage in finding opportunities.

John Fawcett

Sep 16, 2012

Hey @Grant,

Thanks for the post!

Would it be simpler if we just had a magic that set the universe to a random sample of the available securities? Most of the computation will be in the trailing window support we are adding - where a history of N days for M securities will be passed as a dataframe to your algorithm. So, really the purpose of setting the universe is to make the size of the dataframe reasonable. We need to do more experimentation to figure out what, if any, this limit should be. I threw 500 out there because it was an order of magnitude I have been hearing from other Quantopians, not because of any inherent computing limits.

The bigger issue with universe selection is data rather than compute. We also envision a future where the platform provides much more diverse data to your algorithm for universe selection. The ultimate idea is to provide an API to screen securities over time, so that your universe could be defined as "large cap stocks with P/E < 5", and those criteria could be applied repeatedly over the course of the backtest to dynamically update the universe. Until we can provide the broader data, I wanted to allow "manual" universe selection to approximate data-driven screening.
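A rough sketch of what re-applying a screen over time could mean, in plain Python. Nothing here reflects an actual Quantopian API: the function names, the snapshot layout, the 10e9 market-cap cutoff for "large cap", and all of the sample values are made up for illustration.

```python
def screen(fundamentals, min_market_cap, max_pe):
    """Return the sorted ids that pass a 'large cap with P/E below max_pe' screen."""
    return sorted(
        sid_
        for sid_, f in fundamentals.items()
        if f["market_cap"] >= min_market_cap and f["pe"] < max_pe
    )


def universe_over_time(snapshots, min_market_cap=10e9, max_pe=5.0):
    """Apply the same screen to each dated snapshot to get a date-keyed universe."""
    return {
        day: screen(fundamentals, min_market_cap, max_pe)
        for day, fundamentals in snapshots.items()
    }


# Illustrative placeholder ids and values, not real securities or real data.
snapshots = {
    "20120103": {
        1: {"market_cap": 50e9, "pe": 4.0},
        2: {"market_cap": 50e9, "pe": 12.0},
        3: {"market_cap": 1e9, "pe": 3.0},
    },
    "20120201": {
        1: {"market_cap": 50e9, "pe": 6.0},
        2: {"market_cap": 50e9, "pe": 4.5},
        3: {"market_cap": 1e9, "pe": 3.0},
    },
}
print(universe_over_time(snapshots))
```

The result has the same shape as a manually supplied date-keyed universe, which is why manual selection can stand in for data-driven screening in the meantime.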

Thank you again for sharing your thoughts; we love hearing from you.

Grant Kiehne

Sep 17, 2012

Thanks Fawce,

I'll wait for the planned changes to your system. As you describe in an earlier post, I'll be able to do some form of algorithmic sampling once the change is in place and documented. If I understand correctly, I'll need to provide a fixed list of SIDs from which to sample. Since I'm just tinkering around, you can provide a list of all valid SIDs and I can pick a batch of random ones, up to the limit (e.g. 500).

zhang zheng

Nov 7, 2016

@John Fawcett
If I have defined the set_universe() function in my algorithm, I think there is no need for you to scan for calls to the sid or symbol functions in order to configure the simulation with just the data the algorithm needs, because you only need to load the stocks I have defined in set_universe().

Looking forward to your reply.

Thanks

John Fawcett

Nov 7, 2016

Hi Zhang,

In the years since this thread began, we've introduced many new tools for managing the universe of securities. The culmination of all that work is our Pipeline API - you can get acquainted with pipeline with this tutorial. I hope you find it helpful!

happy coding,
fawce

zhang zheng

Nov 7, 2016

Thanks Fawce,
I know a little about how to use the Pipeline API to set the universe of securities. However, my question is not about how to set the universe. It is about why you still scan for calls to the sid and symbol functions to prepare the data the algorithm needs when I have already defined the universe.
In my opinion, the set_universe function tells the backtest engine which data to prepare for the algorithm, so there is no need to scan for sid or symbol calls.

zhang
