why import quantopian.algorithm as algo?

Why would I want to use this:

import quantopian.algorithm as algo

It is in the template (see attached), and sometimes Quantopian support uses it in examples. What's the advantage? It just seems like fluff...

8 responses

Dan Whitnable

Grant, good question. I've been wondering about that too.

Josh Payne

An answer from one of the engineers this morning:

It’s more Pythonic to import things instead of assuming they will be there magically. We provide a bunch of things pre-imported, but that's not obvious, so importing them is more explicit.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Grant Kiehne

I think I get it. Instead of doing this:

from quantopian.algorithm import attach_pipeline, pipeline_output, order_optimal_portfolio

I would do this:

import quantopian.algorithm as algo

but then it is not clear where I need to use the algo prefix thingy, and where it is not needed. Is there a list of what needs to be imported explicitly from quantopian.algorithm? I guess my conclusion is that since I am not aspiring to be more Pythonic, I'll not get hung up on it. Also, I thought it was better practice not to import entire modules unless they are actually needed? Doesn't it use up memory, unnecessarily?

I would suggest dropping it from your basic examples, since the code will be much more readable, and you won't have to answer questions about it again.

Scott Sanderson

I think there are two different questions here:

1. Why, in Python, might someone prefer to write from module import name vs. import module as mod?
2. Why, on Quantopian, would I import a name that's pre-supplied into my algorithm/notebook namespace?

The short answer to (1) is "the performance difference is negligible, and importing just the module can reduce clutter and make it easier to see at use-sites where a function came from".

The short answer to (2) is "you don't need to, but it's helpful to do so because it makes Python code written on Quantopian behave more like normal Python code, which makes it easier to teach and understand for newcomers and people who already know Python".

Background: Python Imports

One of the things many people like about Python is that, in normal usage, it's always possible to figure out where a name is defined by reading the current file.

A name is always either:

- A local variable in the current function.
- A closure variable defined in an outer nested function.
- A global variable defined at module scope in the current file.
- One of a small number of built-in functions.

Many popular languages don't share this property, because they include constructs like C's #include or Java/C++/C#'s implicit this inside methods. Python does allow you to do from module import * to get all the names from a module, but doing so is generally discouraged outside of the interactive console.
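
As a minimal sketch of the four kinds of names listed above (the function and variable names here are made up for illustration, not taken from any Quantopian code):

```python
scale = 10  # global variable defined at module scope in this file


def make_scaler(offset):
    # offset is a local variable of make_scaler.

    def apply(values):
        # values is a local variable of apply.
        # offset is a closure variable from the enclosing function.
        # scale is a module-level global.
        # sum and len are built-in functions.
        return (sum(values) + offset) * scale / len(values)

    return apply


scaler = make_scaler(3)
result = scaler([1, 2, 3])  # (6 + 3) * 10 / 3
```

Every name used inside apply can be traced to one of the four categories just by reading this file, which is the property described above.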

Python provides the import statement as a way to copy values defined in another module into the current module. There are a few different ways to import things, but the ones that are most relevant to this discussion are:

Importing a Module With an Alias:

import some.module as alias

Importing One or More Attributes From a Module

from some.module import name1, name2, name3

In the first version, the Python interpreter reads the file that corresponds to some.module (normally a file at some/module.py relative to a location on your PYTHONPATH) and executes that file to create a module object. The interpreter then binds that module object to the name alias in the current module.

In the second version, the Python interpreter does the same thing to construct some.module, but instead of assigning the module to a variable, the interpreter copies the desired attributes from the imported module to the current module.

In other words, the second version is roughly equivalent to:

import some.module as _anonymous  
name1 = _anonymous.name1  
name2 = _anonymous.name2  
name3 = _anonymous.name3  
del _anonymous

This means that there's no meaningful performance difference between importing a module and importing attributes of the module. Python always imports the entire module no matter what, and in both cases the module gets cached so that any code that imports a value from some.module gets the same value.
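
One way to check this yourself is the sketch below. It assumes the standard library's json.decoder as a stand-in for some.module (any importable module would do), and it uses sys.modules, which is where Python keeps its cache of imported modules:

```python
import sys

import json.decoder as alias          # binds the module object to the name alias
from json.decoder import JSONDecoder  # binds one attribute of that same module

# Both import styles refer to the same underlying object.
same_object = alias.JSONDecoder is JSONDecoder

# The whole module was imported and cached in both cases.
module_is_cached = sys.modules["json.decoder"] is alias

# Importing again does not re-execute the file; it returns the cached module.
import json.decoder as again
reimport_is_same = again is alias
```

All three flags should come out True, which is why the choice between the two styles comes down to readability rather than speed or memory.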

When Should You Import a Module vs. a Module Attribute?

Since there's no meaningful performance difference between importing a module and importing an attribute from a module, the reasons to prefer one vs. the other are mostly cosmetic. As with most questions of style, there aren't hard and fast rules, but here are some semi-objective reasons to prefer one style vs. another.

- When I do import some.module as mod, I have to type mod.attr instead of attr when I want to use an attribute of that module. This can be both good and bad:
  - Prefixing with a module at use-sites can be good because it makes it clearer where a function or class was defined. This matters most often when you have multiple functions with the same name defined in different modules. For example, numpy defines a max function that's specialized to work on numpy arrays. This shares a name with Python's built-in max function. By doing import numpy as np and using np.max instead of max, I make it clearer to a reader of my code that I'm using numpy's max instead of Python's built-in max.
  - Prefixing with a module at use-sites can be bad because it makes your code more verbose. If you use a function many times, it can get tiresome to read and/or type mod.func instead of func.
- When I do from some.module import name, I can use name directly. This makes my code less verbose at use-sites, but also means that it's harder to see at a glance where name came from.
- When I do import some.module as mod, I don't have to edit any imports if I want to use an attribute of mod that wasn't previously being used in my current file. I can just type mod.attr. This is good because it means that I don't have to break my train of thought to edit an import while I'm working. One downside of this is that it's harder for me to find all the things from mod that I use in a given file, because there's no single place where I declare all the attributes of mod that I'm using.
- When I do from some.module import name1, name2, I have to explicitly list out all the attributes of some.module that I'm using. This means that I have to edit an import any time I want to use a new name, but it also means there's a single place in my file where a reader can see what I'm using from some.module. Having all the imported attributes listed in one place can be useful in large projects, but it can also create clutter in files with many dependencies (for example, at the time of this writing, the first 125 lines of zipline/algorithm.py are all imports).
- When I do import some.module as mod, I don't resolve the attributes I'm using until runtime. If I make a typo in a function and accidentally write mod.atr instead of mod.attr, my program won't crash until I execute that line, which might only happen after the program has been running for a while.
- When I do from some.module import attr, if I import the wrong name, my program crashes immediately and tells me that I tried to import something that doesn't exist.
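
Two of the points above (name clashes with built-ins, and when a typo gets caught) can be sketched with the standard library alone. This sketch assumes math.pow as a stand-in for the numpy max example, so that numpy isn't needed to run it, and the misspelled name pie is deliberate:

```python
import math  # module-style import: attributes are looked up when the code runs

# Same function name, different modules: the prefix shows which one is meant.
builtin_result = pow(2, 3)      # Python's built-in pow, returns an int here
module_result = math.pow(2, 3)  # math's pow, always returns a float


def area_with_typo(radius):
    # math has no attribute "pie". Defining this function succeeds;
    # the mistake only surfaces when this line actually executes.
    return math.pie * radius ** 2


def attribute_typo_error():
    try:
        area_with_typo(1.0)
    except AttributeError as exc:
        return type(exc).__name__
    return None


def import_typo_error():
    # With the from-import form, the bad name fails as soon as the
    # import statement itself runs.
    try:
        from math import pie  # deliberate typo
    except ImportError as exc:
        return type(exc).__name__
    return None
```

In a real program the from-import line would sit at the top of the file, so the failure would show up at startup instead of partway through a run.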

I tend to use import some.module as mod when I know that I'm going to use multiple attributes of some.module, but I'm not sure ahead of time which attributes I'm going to use. numpy, pandas, and most of the quantopian submodules fall into this category, which is why most algorithms I write on Quantopian start with a preamble like:

import numpy as np  
import pandas as pd  
import quantopian.algorithm as algo  
import quantopian.optimize as opt  
import quantopian.pipeline as pipe

pipeline is the module that I'm most likely to add more specific imports for, because pipeline has a lot more nested structure for all the different vendors we have. In general though, I'd like to reduce the number of distinct imports that are required to use the Pipeline API effectively.

Why Should I Import quantopian.algorithm at All?

All of the discussion above applies to any Python program, not just algorithms and notebooks written on Quantopian. One thing that's unique to Quantopian is that, for historical reasons, we "pre-import" many of our API functions into the namespace of your algorithms and research notebooks.

One of the decisions we made early in the design of Quantopian was to provide many of our API functions as "magic" names that didn't need to be imported. These are generally functions like order or the original history function, which we expected almost all algorithms would want to use, so we decided that it was nicer to "pre-import" these names so that all algorithms didn't have to include the same set of boilerplate imports. We also thought that pre-importing these names might make it easier for newcomers to Python to get started on Quantopian.

In retrospect, however, I think that adding these pre-imported names was a mistake, for two reasons:

1. While having the pre-imported names makes it easier to write an algorithm that runs, the fact that Python written on Quantopian behaves differently than Python written elsewhere makes it hard for users to build a mental model of how names and imports work in Python. This means that when users run into problems, they often have a hard time distinguishing between normal Python problems and Quantopian-specific problems.
2. As the Quantopian API has grown over time, it's become infeasible to pre-import every single name in the API into your algorithm/notebook namespace by default. This means that we're now in the strange position that some of our API functions are pre-imported into your algorithm or research notebook, while others are not.

All of the pre-existing magic names can now be imported from quantopian.algorithm (or quantopian.research for notebooks), and for the reasons listed above most posts by Quantopian employees now prefer those names over the old "magic names".

In the interest of backwards compatibility, we haven't removed the built-in names, and I think it's unlikely that we ever will (we may add warnings for using them at some point). I expect, however, that most new additions to the Quantopian API will be added somewhere on the quantopian module and will have to be imported in the same way that everything else in Python needs to be imported. (One notable recent exception to this rule is order_optimal_portfolio, which we added as a new "magic name" because it seemed too confusing to have that be our only ordering function that wasn't pre-imported.)
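
To make the idea of "pre-imported" names concrete, here is a toy sketch of how a host program could seed a namespace before running user code. This is an assumption for illustration only, not a description of how Quantopian actually does it; fake_algo, order, and run_user_code are all hypothetical names:

```python
import types

# Hypothetical stand-in for a platform API module (not the real quantopian.algorithm).
fake_algo = types.ModuleType("fake_algo")


def order(asset, amount):
    """Toy ordering function that just reports what it was asked to do."""
    return "ordered {} of {}".format(amount, asset)


fake_algo.order = order

# User code that relies on a "magic" name: it never imports order.
USER_SOURCE = """
def handle_data():
    return order("AAPL", 10)
"""


def run_user_code(source, preimported):
    """Execute source in a fresh namespace seeded with the given names."""
    namespace = dict(preimported)
    exec(source, namespace)
    return namespace


# With the name injected by the host, the user code runs without any import.
namespace = run_user_code(USER_SOURCE, {"order": fake_algo.order})
magic_result = namespace["handle_data"]()

# Run as ordinary Python, with nothing injected, the same code fails.
bare_namespace = run_user_code(USER_SOURCE, {})
try:
    bare_namespace["handle_data"]()
    bare_error = None
except NameError as exc:
    bare_error = type(exc).__name__
```

The second run is the situation a newcomer hits when copying such code into a plain Python file: the name was never defined anywhere they can see, so the NameError is hard to explain.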

Hope that helps,
- Scott

Grant Kiehne

Thanks Scott -

It sounds like kinda-sorta there will be a certain style applied by Quantopian employees, and I should just mimic it in my code (if anything, for consistency with your examples). There will be no performance difference.

One risk, I suppose, with "magic names" is version control. For example, say I use the magic version of order_optimal_portfolio (not imported explicitly in code that I've written). Well, how do I know that I'm not using a different version from the current version available via explicit import (unless I can see how you are applying the magic, to confirm that the code is coming from the same source)?

On a related topic (since I have your attention), which could be for a different forum post: any thoughts on allowing users to define their own libraries and import them? I know that this is problematic in that you rely on backtests having a fixed set of user-contributed code that becomes part of the backtest data set itself (backtests, in essence, become versions of the code...really ugly, by the way). Of course, there is also GitHub integration for users. And numerous other potential usability discussions that could move things into the 21st century...but I'm getting off into the weeds.

Savio Cardozo

Just coming back on the platform after a little hiatus and the quantopian.algorithm was a mystery to me until I read this - thank you Grant, Scott, and Josh - by the way, love Josh's response or rather the one from the Engineer at Q.

Blue Seahawk

I had wondered also, glad it was mentioned.
Seems there are two schools of thought. For example the following are essentially the same.

In a hurry, obsessed with efficiency:

    schedule_function(trade,   date_rules.every_day(), time_rules.market_open() )  
    schedule_function(records, date_rules.every_day(), time_rules.market_close())

In tuxedo ready for a formal event:

import quantopian.algorithm as algo

def initialize(context):
    # The schedule_function calls go inside initialize.
    algo.schedule_function(
        func=trade,
        date_rule=algo.date_rules.every_day(),
        time_rule=algo.time_rules.market_open())
    algo.schedule_function(
        func=record_some_things_to_the_custom_chart,
        date_rule=algo.date_rules.every_day(),
        time_rule=algo.time_rules.market_close())

The latter would wrap-ugly if on single lines.

Vladimir

I belong to the school "obsessed with brevity"
And I would leave only one line from the Blue Seahawk example:

    schedule_function(trade, date_rules.every_day(), time_rules.market_open())

the rest may be done in before_trading_start.

For me, the "in tuxedo" code is unreadable and therefore inefficient.
