For those running into memory limitations on the platform, wanting to use other libraries not supported by Q, or just wanting to develop locally, here's a brief guide to setting up a local Research environment using the open-source Zipline library that powers Q.
Guide
1. Follow the Zipline installation documentation and ideally install it into a separate environment, along with the other libraries you need. Note that Zipline only supports up to Python 3.5.
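With Anaconda, for example, that could look like the following (the channel and Python version here follow the Zipline install docs, but check them for the currently recommended setup):
$ conda create -n env_zipline python=3.5
$ conda install -n env_zipline -c Quantopian zipline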
2. Activate the environment. With the Anaconda distribution, this would be:
$ conda activate env_zipline
- Run the following to make sure everything is installed correctly. You should see quandl and quantopian-quandl with no ingestions.
$ zipline bundles
- Ingest the quantopian-quandl data bundle. This will serve as the replacement for USEquityPricing on Q. However, note that only end-of-day prices come built in; for higher-granularity data you would need a subscription to the Quandl SEP bundle (~$30/month).
$ zipline ingest -b quantopian-quandl
- In your algorithm, the following library references need to be changed:
from zipline.api import *
from zipline.pipeline import CustomFactor, Pipeline
from zipline.pipeline.data import USEquityPricing
from zipline.pipeline.factors import _ # Built-in factors here
from zipline.pipeline.engine import PipelineEngine
This should allow you to run a pipeline locally on your machine, develop a model, etc., and then upload the outputs through Q's Custom Data functionality. A rough sketch of the wiring is below.
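Here is a sketch of running a pipeline against the ingested bundle in a standalone script. The bundle name, calendar, dates, and the AverageDollarVolume column are just illustrative choices, and the SimplePipelineEngine / USEquityPricingLoader constructor arguments vary a bit between Zipline releases, so adjust to your installed version:
import pandas as pd
from zipline.data import bundles
from zipline.pipeline import Pipeline
from zipline.pipeline.data import USEquityPricing
from zipline.pipeline.engine import SimplePipelineEngine
from zipline.pipeline.factors import AverageDollarVolume
from zipline.pipeline.loaders import USEquityPricingLoader
from zipline.utils.calendars import get_calendar

# Load the ingested bundle and build a loader for the pricing columns
bundle = bundles.load('quantopian-quandl')
pricing_loader = USEquityPricingLoader(bundle.equity_daily_bar_reader,
                                       bundle.adjustment_reader)

def choose_loader(column):
    # Only pricing columns are available from this loader
    if column in USEquityPricing.columns:
        return pricing_loader
    raise ValueError("No loader registered for %s" % column)

engine = SimplePipelineEngine(get_loader=choose_loader,
                              calendar=get_calendar('NYSE').all_sessions,
                              asset_finder=bundle.asset_finder)

# Illustrative pipeline: 20-day average dollar volume for every asset
pipe = Pipeline(columns={'adv_20': AverageDollarVolume(window_length=20)})
results = engine.run_pipeline(pipe,
                              pd.Timestamp('2017-01-03', tz='utc'),
                              pd.Timestamp('2017-12-29', tz='utc'))
results.to_csv('pipeline_output.csv')  # e.g. for upload via Custom Data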
(Optional) For Fundamentals data
1. Register on Quandl and find your API Access Key. You would also need a subscription to the Sharadar/SF1 dataset (~$30/month).
2. Clone or download the alpha-compiler repo. Install the quandl and zipline libraries into a new environment with Python 2.x:
$ conda create --name env_alphacompile python=2.7
$ conda activate env_alphacompile
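One way to install the two libraries inside the activated environment (conda install -c Quantopian zipline also works):
$ pip install quandl zipline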
- Navigate to where you unzipped the repo and run setup.py (e.g. python setup.py install). Then open load_quandl_sf1.py in ..\envs\env_alphacompile\Lib\site-packages\alphacompiler\data
- Near the top of the file, set your API key as an environment variable:
os.environ['QUANDL_API_KEY'] = 'abc' # Put your API key here
- Add a start_date you would like to query from, and change the bottom of the file to the following. Then modify the fields list with the fundamental data you need; see the Quandl dataset documentation for the available values (an illustrative example follows the code below).
if __name__ == '__main__':
    BUNDLE_NAME = 'quantopian-quandl'
    fields = []  # List of the fields you want to query
    num_tickers = all_tickers_for_bundle(fields, BUNDLE_NAME)
    tot_tickers = num_tkrs_in_bundle(BUNDLE_NAME)
    pack_sparse_data(tot_tickers + 1,  # number of tickers in bundle + 1
                     os.path.join(BASE, RAW_FLDR),
                     fields,
                     os.path.join(BASE, FN))
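As an illustration only, a populated fields list might look like the line below; the names are just examples borrowed from later in this guide, so substitute the SF1 indicator codes you actually need:
fields = ['capex', 'currentratio']  # illustrative example only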
- Open .\alpha-compiler-master\alpha-compiler-master\alphacompiler\util\sparse_data.py. In __init__(), change self.data_path to 'SF1.npy'. In pack_sparse_data(), replace the df statement with:
dateparse = (lambda x: pd.datetime.strptime(x, '%Y-%m-%d'))
df = pd.read_csv(os.path.join(rawpath, fn),
                 index_col="Date",
                 parse_dates=['Date'],
                 date_parser=dateparse)
- Run load_quandl_sf1.py. This will start fetching data via API calls and will take some time.
$ cd .\alpha-compiler-master\alpha-compiler-master\alphacompiler\data
$ python load_quandl_sf1.py
- Open sf1_fundamentals.py and change the fields to match the ones you used in the previous step. Copy the whole alphacompiler folder, along with the processed dataset, into env_zipline\Lib\site-packages (Python 3 is fine).
- Reactivate the env_zipline environment
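As before, with Anaconda:
$ conda activate env_zipline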
- If everything has worked as it should, you can import the library and Fundamentals will be accessible:
from alphacompiler.data.sf1_fundamentals import Fundamentals
fd = Fundamentals()
# Replace all calls to Fundamentals with fd
# Example: fd.capex
- Make sure you update the fundamentals references in your algorithm (e.g. PE_RATIO) to the same names used in fields in load_quandl_sf1.py. If fundamentals are used in a CustomFactor, make sure to manually mark them as window safe:
fd.currentratio.window_safe = True  # mark the fundamental as safe to use as an input

class MyFactor(CustomFactor):
    inputs = [fd.currentratio]
    window_length = 1

    def compute(self, today, assets, out, value):
        out[:] = value[-1]  # do something with the latest values
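To tie things together, a hypothetical pipeline could use both the raw fundamental and the custom factor above. Whether fd.currentratio can be used directly as a column depends on how alphacompiler exposes it, so treat this as a sketch:
pipe = Pipeline(columns={
    'current_ratio': fd.currentratio,  # hypothetical: raw fundamental as a column
    'my_factor': MyFactor(),
})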
Note: Certain Q features are proprietary, such as the QTradableStocksUS universe filter, the Risk Model, and the Optimize API. You could attempt to replicate these locally, but otherwise I would recommend only moving the Research component locally to develop your model, then feeding its outputs back into Q for the actual backtesting.
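If you do want a rough local stand-in for the universe filter, a liquidity screen built from Zipline's built-in factors gets you part of the way. This only approximates the tradability criteria, not the full QTradableStocksUS definition:
from zipline.pipeline.factors import AverageDollarVolume

# Very rough approximation: top 2000 stocks by 200-day average dollar volume
universe = AverageDollarVolume(window_length=200).top(2000)
# Pass it as screen=universe when constructing your Pipeline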