
Backtesting a Moving Average Crossover Strategy

In this example we use get_pricing to load roughly 13 years of historical pricing data for Apple's stock (ticker symbol AAPL).

We then define a Dual Moving Average Crossover algorithm with zipline, the open source backtesting library that powers Quantopian.

Finally, we backtest our strategy against the loaded pricing data and create a visualization of our entry and exit points.
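Before diving into zipline, the core crossover logic can be sketched with plain pandas. This is a toy illustration, not part of the algorithm below: the windows are shortened from 100/300 days to 3/5 bars so the crossover is easy to see.

```python
import pandas as pd

# Toy price series: rises then falls, producing one bullish and one bearish crossover
prices = pd.Series([1.0, 2, 3, 4, 5, 4, 3, 2, 1, 0])
short_mavg = prices.rolling(3).mean()
long_mavg = prices.rolling(5).mean()

# Long (1) while the short average sits above the long average, flat (0) otherwise
signal = (short_mavg > long_mavg).astype(int)
```

The strategy below applies exactly this comparison on each bar, ordering into a position while the signal is on and back to flat when it turns off.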

In [1]:
# Import Zipline, the open source backtester, and a few other libraries that we will use
import zipline
from zipline import TradingAlgorithm
from zipline.api import order_target, record, symbol, history, add_history

import pytz
from datetime import datetime
import matplotlib.pyplot as pyplot
import numpy as np
In [16]:
# Load pricing data for AAPL with get_pricing
data = get_pricing(
    [symbols(24)],  # sid 24 is AAPL; equivalent to ['AAPL']
    start_date='2002-01-01',
    end_date='2015-02-15',
    frequency='daily'
)
data.price.plot(use_index=False)
Out[16]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f6030ebf410>
In [17]:
data
Out[17]:
<class 'pandas.core.panel.Panel'>
Dimensions: 6 (items) x 3303 (major_axis) x 1 (minor_axis)
Items axis: open_price to price
Major_axis axis: 2002-01-02 00:00:00+00:00 to 2015-02-13 00:00:00+00:00
Minor_axis axis: Equity(24 [AAPL]) to Equity(24 [AAPL])
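The Panel above stacks fields on the items axis, dates on the major axis, and securities on the minor axis. Panel was later removed from pandas (0.25+); a rough sketch of the equivalent MultiIndex layout in modern pandas (dates and values here are illustrative, not from the dataset):

```python
import pandas as pd

# Panel axes were fields x dates x securities; the modern equivalent is a
# DataFrame indexed by (date, security) with one column per field.
idx = pd.MultiIndex.from_product(
    [pd.date_range('2002-01-02', periods=3), ['AAPL']],
    names=['date', 'security'])
panel_like = pd.DataFrame({'open_price': [1.0, 2.0, 3.0],
                           'price': [1.1, 2.1, 3.1]}, index=idx)

# data.price in the Panel API corresponds to unstacking one field:
prices = panel_like['price'].unstack('security')
```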
In [18]:
# Define the algorithm - this should look familiar from the Quantopian IDE
# For more information on writing algorithms for Quantopian
# and these functions, see https://www.quantopian.com/help

def initialize(context):
    # Register 2 histories that track daily prices,
    # one with a 100-day window and one with a 300-day window
    add_history(100, '1d', 'price')
    add_history(300, '1d', 'price')

    context.i = 0
    context.aapl = symbol('AAPL')

def handle_data(context, data):
    # Skip first 300 days to get full windows
    context.i += 1
    if context.i < 300:
        return

    # Compute averages
    # history() has to be called with the same params
    # from above and returns a pandas dataframe.
    short_mavg = history(100, '1d', 'price').mean()
    long_mavg = history(300, '1d', 'price').mean()

    # Trading logic
    if short_mavg[context.aapl] > long_mavg[context.aapl]:
        # order_target orders as many shares as needed to
        # achieve the desired number of shares.
        order_target(context.aapl, 100)
    elif short_mavg[context.aapl] < long_mavg[context.aapl]:
        order_target(context.aapl, 0)

    # Save values for later inspection
    record(AAPL=data[context.aapl].price,
           short_mavg=short_mavg[context.aapl],
           long_mavg=long_mavg[context.aapl])
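The sizing behaviour of order_target can be illustrated standalone. This is a simplified sketch of the idea only (the real zipline call also accounts for open orders): it trades the difference between the target position and the current one.

```python
def order_target_delta(current_shares, target_shares):
    # order_target trades however many shares are needed so the position
    # ends at target_shares; positive = buy, negative = sell.
    return target_shares - current_shares

# Entering on a bullish crossover: flat -> 100 shares
buy_order = order_target_delta(0, 100)
# Exiting on a bearish crossover: 100 shares -> flat
sell_order = order_target_delta(100, 0)
```

This is why the trading logic above can call order_target(context.aapl, 100) on every bullish bar without accumulating shares: once the position is at the target, the delta is zero.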
In [19]:
# Analyze is a post-hoc analysis method available on Zipline. 
# It accepts the context object and 'perf' which is the output 
# of a Zipline backtest.  This API is currently experimental, 
# and will likely change before release.

def analyze(context, perf):
    fig = pyplot.figure()
    
    # Make a subplot for portfolio value.
    ax1 = fig.add_subplot(211)
    perf.portfolio_value.plot(ax=ax1, figsize=(16,12))
    ax1.set_ylabel('portfolio value in $')

    # Make another subplot showing our trades.
    ax2 = fig.add_subplot(212)
    perf['AAPL'].plot(ax=ax2, figsize=(16, 12))
    perf[['short_mavg', 'long_mavg']].plot(ax=ax2)

    perf_trans = perf.ix[[t != [] for t in perf.transactions]]
    buys = perf_trans.ix[[t[0]['amount'] > 0 for t in perf_trans.transactions]]
    sells = perf_trans.ix[
        [t[0]['amount'] < 0 for t in perf_trans.transactions]]

    # Add buy/sell markers to the second plot
    ax2.plot(buys.index, perf.short_mavg.ix[buys.index],
             '^', markersize=10, color='m')
    ax2.plot(sells.index, perf.short_mavg.ix[sells.index],
             'v', markersize=10, color='k')
    
    # Set figure metadata
    ax2.set_ylabel('price in $')
    pyplot.legend(loc=0)
    pyplot.show()
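The transaction-filtering idiom in analyze can be exercised on a toy frame. Shown here with .loc, since .ix was later deprecated in pandas; the column values are made up for illustration.

```python
import pandas as pd

# Each row holds that bar's list of transaction dicts, as in a zipline perf frame
perf = pd.DataFrame({
    'transactions': [[], [{'amount': 100}], [], [{'amount': -100}]],
    'short_mavg': [10.0, 11.0, 12.0, 11.5],
})

# Keep only bars with at least one transaction, then split by trade direction
trades = perf.loc[[t != [] for t in perf.transactions]]
buys = trades.loc[[t[0]['amount'] > 0 for t in trades.transactions]]
sells = trades.loc[[t[0]['amount'] < 0 for t in trades.transactions]]
```

The resulting buys and sells indices are the dates at which markers get drawn on the price plot.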
In [20]:
# NOTE: This cell will take a few minutes to run.

# Create algorithm object passing in initialize and
# handle_data functions
algo_obj = TradingAlgorithm(
    initialize=initialize, 
    handle_data=handle_data
)

# HACK: Analyze isn't supported by the parameter-based API, so
# tack it directly onto the object.
algo_obj._analyze = analyze

# Run the algorithm; transpose the panel from fields x dates x securities
# to securities x dates x fields, the layout run() expects
perf_manual = algo_obj.run(data.transpose(2, 1, 0))
In [21]:
import pyfolio as pf
In [23]:
returns, positions, transactions, gross_lev = pf.utils.extract_rets_pos_txn_from_zipline(perf_manual)
In [29]:
pf.create_bayesian_tear_sheet(returns, live_start_date='2011-01-01')
Running T model
 [-----------------100%-----------------] 2000 of 2000 complete in 3.3 sec
Finished T model (required 31.63 seconds).

Running BEST model
 [-----------------100%-----------------] 2000 of 2000 complete in 54.8 sec
Finished BEST model (required 90.36 seconds).

Finished plotting Bayesian cone (required 0.81 seconds).

Finished plotting BEST results (required 0.81 seconds).

Finished computing Bayesian predictions (required 0.14 seconds).

Finished plotting Bayesian VaRs estimate (required 0.06 seconds).

Running alpha beta model
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-29-10687a9948a6> in <module>()
----> 1 pf.create_bayesian_tear_sheet(returns, live_start_date='2011-01-01')

/usr/local/lib/python2.7/dist-packages/pyfolio/plotting.pyc in call_w_context(*args, **kwargs)
     44         if set_context:
     45             with context():
---> 46                 return func(*args, **kwargs)
     47         else:
     48             return func(*args, **kwargs)

/usr/local/lib/python2.7/dist-packages/pyfolio/tears.pyc in create_bayesian_tear_sheet(returns, benchmark_rets, live_start_date, samples, return_fig, stoch_vol)
    823     trace_alpha_beta = bayesian.run_model('alpha_beta', df_train,
    824                                           bmark=benchmark_rets,
--> 825                                           samples=samples)
    826     previous_time = timer("running alpha beta model", previous_time)
    827 

/usr/local/lib/python2.7/dist-packages/pyfolio/bayesian.pyc in run_model(model, returns_train, returns_test, bmark, samples, ppc)
    574     if model == 'alpha_beta':
    575         model, trace = model_returns_t_alpha_beta(returns_train,
--> 576                                                   bmark, samples)
    577     elif model == 't':
    578         model, trace = model_returns_t(returns_train, samples)

/usr/local/lib/python2.7/dist-packages/pyfolio/bayesian.pyc in model_returns_t_alpha_beta(data, bmark, samples)
     82         X.loc[:, 'ones'] = 1.
     83         y = data_no_missing
---> 84         alphabeta_init = np.linalg.lstsq(X, y)[0]
     85 
     86         alpha_reg = pm.Normal('alpha', mu=0, sd=.1, testval=alphabeta_init[-1])

/usr/local/lib/python2.7/dist-packages/numpy/linalg/linalg.pyc in lstsq(a, b, rcond)
   1865         work = zeros((lwork,), t)
   1866         results = lapack_routine(m, n, n_rhs, a, m, bstar, ldb, s, rcond,
-> 1867                                  0, work, lwork, iwork, 0)
   1868     if results['info'] > 0:
   1869         raise LinAlgError('SVD did not converge in Linear Least Squares')

ValueError: On entry to DLASCL parameter number 4 had an illegal value
In [27]:
pf.create_full_tear_sheet(returns, positions=positions, transactions=transactions,
                          gross_lev=gross_lev, live_start_date='2011-01-01', bayesian=True)
Entire data start date: 2002-01-04
Entire data end date: 2015-02-13


Out-of-Sample Months: 49
Backtest Months: 107
                   Backtest  Out_of_Sample  All_History
annual_return          0.00           0.02         0.01
annual_volatility      0.01           0.02         0.01
sharpe_ratio           0.71           0.85         0.68
calmar_ratio           0.25           0.37         0.18
stability              0.81           0.33         0.83
max_drawdown          -0.02          -0.04        -0.04
omega_ratio            1.18           1.19         1.19
sortino_ratio          1.01           1.25         0.99
skewness              -0.39          -0.16        -0.15
kurtosis              11.05           7.97        21.21
information_ratio     -0.01          -0.05        -0.02
alpha                  0.00           0.01         0.01
beta                   0.01           0.05         0.02

Worst Drawdown Periods
   net drawdown in %  peak date valley date recovery date duration
0               4.08 2012-09-19  2014-01-30    2014-11-20      567
1               1.54 2007-12-28  2008-10-09    2010-04-22      605
2               1.40 2012-04-09  2012-05-17    2012-08-16       94
3               1.19 2014-11-26  2015-01-16    2015-02-04       51
4               0.79 2011-10-14  2011-11-25    2012-01-06       61


2-sigma returns daily    -0.001
2-sigma returns weekly   -0.003
dtype: float64
Stress Events
                                   mean    min    max
Lehmann                              -0 -0.003  0.001
US downgrade/European Debt Crisis    -0 -0.003  0.003
Fukushima                            -0 -0.002  0.001
US Housing                            0  0.000  0.000
EZB IR Event                         -0 -0.002  0.002
Aug07                                 0 -0.001  0.001
Mar08                                 0 -0.001  0.001
Sept08                               -0 -0.003  0.001
2009Q1                                0  0.000  0.000
2009Q2                                0  0.000  0.000
Flash Crash                          -0 -0.001  0.003
Apr14                                 0 -0.001  0.006
Oct14                                 0 -0.001  0.003
Low Volatility Bull Market            0 -0.001  0.001
GFC Crash                            -0 -0.003  0.002
Recovery                              0 -0.005  0.007
New Normal                            0 -0.009  0.006

Top 10 long positions of all time (and max%)
[24]
[ 0.115]


Top 10 short positions of all time (and max%)
[]
[]


Top 10 positions of all time (and max%)
[24]
[ 0.115]


All positions ever held
[24]
[ 0.115]


Running T model
 [-----------------100%-----------------] 2000 of 2000 complete in 6.0 sec
Finished T model (required 97.92 seconds).

Running BEST model
 [-----------------100%-----------------] 2000 of 2000 complete in 55.3 sec
Finished BEST model (required 125.28 seconds).

Finished plotting Bayesian cone (required 0.81 seconds).

Finished plotting BEST results (required 0.82 seconds).

Finished computing Bayesian predictions (required 0.13 seconds).

Finished plotting Bayesian VaRs estimate (required 0.05 seconds).

Running alpha beta model
 [-----------------100%-----------------] 2000 of 2000 complete in 3.9 sec
Finished running alpha beta model (required 56.54 seconds).

Finished plotting alpha beta model (required 0.15 seconds).

Total runtime was 281.70 seconds.
/usr/local/lib/python2.7/dist-packages/matplotlib/axes/_axes.py:475: UserWarning: No labelled objects found. Use label='...' kwarg on individual plots.
  warnings.warn("No labelled objects found. "
In [ ]: