Environment Variables for the Run

Hi Gary,

Did you ever get a response on this?

Grant

Thanks for asking. Yes, mainly today actually, a Saturday no less. I wanted it done yesterday, if not sooner; my sister's wedding has no importance by comparison.
I should clarify a couple of things for anyone who might be interested or in case it might be helpful ...

Regarding the comment above on 'definitely planning to keep that functionality', that was just a natural mixup between str(dir) and the very different str(dir()) with parens.
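To make the mixup concrete, here is a minimal sketch in standard Python (outside the Quantopian sandbox, whose custom dir behaves differently). The function name describe_dir and the variable local_value are made up for illustration.

```python
def describe_dir():
    local_value = 1  # hypothetical local, just so the listing has something in it
    # str(dir) converts the function object itself to a string.
    as_function = str(dir)
    # str(dir()) calls dir first, then converts the resulting list of names.
    as_listing = str(dir())
    return as_function, as_listing

as_function, as_listing = describe_dir()
print(as_function)  # a description of the built-in function
print(as_listing)   # a list of names visible where dir() was called
```

One pair of parens is the whole difference: the first form never calls dir at all.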

dir() by itself returned an error for me (in standard Python it lists the names in the current scope). Supplied with one arg, however, it returns a list of the things that can be done with that thing, a Python feature I suppose.
For example, on an integer and on empty structures below:

dir(2)  
['bit_length', 'conjugate', 'denominator', 'imag', 'numerator', 'real']

dir('')  
['capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']

dir([])  
['append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']

dir({})  
['clear', 'copy', 'fromkeys', 'get', 'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values', 'viewitems', 'viewkeys', 'viewvalues']

... or how about on data for example:

dir(data)  
['has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'values']
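The listings above show only the public names. In standard Python, dir() also returns the underscore-prefixed special names, so a small filter is one way to get comparable output. This is only a sketch; the helper name public_names is made up, and the exact names returned depend on the Python version (the listings in this thread look like Python 2).

```python
def public_names(obj):
    """Return the attribute names of obj that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith('_')]

print(public_names(2))
print(public_names([]))
print(public_names({}))
```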

Meanwhile, str(dir) has been confirmed as a clever hack, with congrats and accolades, and has been turned off.

How would I put this if I had any marketing skills whatsoever: Everyone is going to feel really good about it when the official environment info is made available (right away, eh-hem), neighbors will be envious about the new feature and the world will spin more true on its axis. :)

Gary,

If I read you correctly, the functionality (perhaps in another form) will be restored eventually. Or no? It seemed pretty darn handy, and harmless.

Grant

Hi Gary and Grant,

The team has talked about this issue a lot since it was brought up. Today I was able to catch up with their thinking. Our feeling is that what you (and others) are asking for is a terrific feature request. I see two use cases we need to support in the API, which I describe below.

The use of str(dir) to get your simulation parameters is definitely not behavior we intended (or even noticed before you did, tbh). Simulation parameters printing via str(dir) is an implementation quirk of our custom dir method.

Until we provide the API you need, you can use str(sid) to get a similar response. Once we provide the API method, please expect that the string representation of internal methods, and other undocumented hacks (however clever they may be :)), will change. We will always do our best to help users who stretch the system in unexpected and creative ways. However, undocumented hacks are risky since we can only support and maintain our documented API methods.

Regarding our future plans:

I see two use cases from reading your code:

1. a method that is called after the close of each session, to analyze an algorithm's performance (currently, you're approximating this by calculating the time its first/last bar will be)
2. a method for an algorithm to check the environment: minutely/daily, backtest/live, capital_base, etc.

For case #1, we have wonderful API support for scheduling:
https://www.quantopian.com/help#api-schedulefunction

For use case #2 above, we could extend get_environment() to return a dictionary of information about the parameters of your backtest. Here's a quick spec (based in part on Gary's email to me about the needed environment variables):

get_environment(field='platform')  
"""
    field: the name of the environment variable to return.  
    '*' is a special value to fetch all fields.  
    Default value is backward compatible with today's get_environment().

    returns: The environment is stored as key/value pairs.  
    The value associated with the field parameter is returned.  
    If field='*' the entire environment is returned as a dictionary. 


        {  
            'arena'         : 'IB' | 'IB_paper' | 'paper' | 'backtest'  
            'data_frequency': 'daily' | 'minute'  
            'start'         :  
                UTC datetime for start of backtest (inception for live/paper)  
                e.g. 2014-11-25 14:31:00  
            'end'           :  
                UTC datetime for end of backtest (today's close for live/paper)  
                e.g. 2014-11-26 21:00:00  
            'capital_base'  :  float of the original capital in US$.  
                               e.g. 1000000  
            'platform'      : 'quantopian' | 'zipline'  
        }

"""

Thank you for using Quantopian. We love having you in our community.

thanks,
fawce

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Thanks Fawce,

Regarding the new API schedule_function, I gather that it supports repeated calls to a function, but from the help page it is not obvious how to schedule a backtest post-processing call (a function to be called after the backtest is complete, as initialize is called at the outset). Also, I assume that the function still has to execute within the 50 second time-out window of handle_data. I would consider whether it would be better to have a true post-processing routine, with access to the results that are displayed in the GUI at the backtest end, along with any custom data stored by the user.

You might also think about the initialize/pre-processing and post-processing in the context of the research tools being developed. There need to be some hooks to initialize a backtest (set up in the online GUI backtester) from the research environment, and then post-process the results and store them for analysis in the research environment. This way, the backtest algo can be abstracted as an objective function for optimization, as one use case.

On the topic of the crash that Gary experienced, how did the change end up affecting Gary's live algorithm? Are live algos subject to code revisions? Or once the code is running, is it locked to a specific production release? You could consider giving users control over the version, so that they can decide if they want to risk automatic updates, or manage the changes (e.g. launch a duplicate algo and test it under the new Quantopian code version).

Grant

Thank you John and everyone attending to it.
As many know, I've been working on Run Summary (really attached to it), new version there, and with this change I'm picturing the relevant area to be 25 fewer lines of code like:

            'init_cash': get_environment('capital_base'),  
            'arena'    : get_environment('arena'), # 'backtest' | 'live' | 'paper'  
            'mode'     : get_environment('data_frequency'), # 'daily' | 'minute'  
            'first_trading_date': str(get_environment('start').date()),  
            'last_trading_date' : str(get_environment('end')  .date()),  
            'last_trading_time' : str(get_environment('end')  .time()), # UTC

Earlier I had thought it might make sense for get_environment() to only be available from initialize(), since the values do not change; however, for the Run Summary thing that would require an additional call to summary() from within initialize(). I guess I've been trying to make the code as simple as possible for someone to drop into their own algo super easily; currently it can usually be a single paste at the end of handle_data() (the call to summary() and the summary() function immediately following). I agree that schedule_function() is great, just that case #1 above would add an additional paste. 100% more, for two copy/pastes. Eh, anyway ...

I'm really looking forward to this change. Thanks again!

Trying to take an algo live with real money today, the interim workaround str(sid) fails there.

What's the failure?

Grant,

On the question about platform change management, we have recently finished our test harness for code changes.

Prior to any production changes we:

1. find every algorithm running against IB
2. run each algorithm with the prior version of the platform, storing simulation results
3. run each algorithm with the new version of the platform, again storing simulation results
4. compare prior and new results

If the simulations encounter any exceptions, or if we find unexpected discrepancies in the simulation results, we don't release the update.
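In rough outline, the gate described above could look like the sketch below. This is an illustration only, not our actual harness: safe_to_release, run_prior and run_new are hypothetical names, and the sketch simplifies by treating any difference in results as a blocker rather than only unexpected ones.

```python
def safe_to_release(algorithms, run_prior, run_new):
    """Run every algorithm under both platform versions and compare results.

    run_prior and run_new are hypothetical callables that take an algorithm
    and return its simulation results. Returns (ok, problems).
    """
    problems = []
    for algo in algorithms:
        try:
            prior_results = run_prior(algo)
            new_results = run_new(algo)
        except Exception as exc:
            # Any exception during simulation blocks the release.
            problems.append((algo, 'exception: %r' % (exc,)))
            continue
        if prior_results != new_results:
            problems.append((algo, 'results differ'))
    return (not problems), problems

# Toy usage: "results" here are just numbers derived from the algorithm name.
ok, problems = safe_to_release(['algo_a', 'algo_b'], len, len)
print(ok, problems)
```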

thanks,
fawce

Thanks Fawce,

Sounds pretty thorough! You might just cut-and-paste your response above into the help/FAQ docs, if it'll be the standard going forward.

In any case, it's kinda scary that the code will be constantly re-worked. I guess it ain't firmware.

If you end up with 10,000 algorithms running in production, your testing approach could be difficult to scale, no?

Grant

Hi,

Not changing the code scares me more :).
The simulations run in parallel on a dynamic array of machines in AWS. 10,000 algorithms would be about 1-5k machine hours, so an array of a few hundred machines would finish the work in a reasonable time.
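As a back-of-the-envelope check on those numbers, assuming the work divides evenly across machines and taking "a few hundred" to mean 300 (both assumptions are mine for the sake of the arithmetic):

```python
def wall_clock_hours(machine_hours, machines):
    """Elapsed hours if the work splits evenly across the machines."""
    return machine_hours / float(machines)

# Low and high ends of the 1-5k machine-hour estimate, on an assumed 300 machines.
for total_machine_hours in (1000, 5000):
    print(total_machine_hours, round(wall_clock_hours(total_machine_hours, 300), 1))
```

That works out to somewhere between about 3 and 17 hours of elapsed time for the whole set.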

thanks,
fawce

@Fawce Btw str(sid) was ok in paper trading; I had a try/except around str(sid) live and only know that the except kicked in.

Hi,

We just shipped an update to get_environment that works as described above. You can see the full story in the help docs: https://www.quantopian.com/help#api-get-environment
There is also a code-assist in the IDE.

Thanks again for the feedback,
fawce
