Computing compounded returns

Hi everyone,

I think my weekend plans have changed since watching the 'live trading' webinar on Thursday :-) So I am getting on with my coding and I have a little bit of experience working in Python. I am aiming to compute the continuously compounded returns of a range of assets (the ETFs under the initialisation function). However, I am completely stuck:

# Put any initialization logic here.  The context object will be passed to  
# the other methods in your algorithm.

# Imports  
import datetime  
import numpy as np  
import pandas as pd


def initialize(context):  
    # Equities  
    context.IVV = sid(21513)  # Benchmark: S&P 500, Core Large-cap US  
    # context.IWM = sid(21519)  # Benchmark: Russell 2000, Small-cap US  
    # context.EEM = sid(24705)  # Benchmark: MSCI Emerging Markets, Large-cap EM  
    # context.EWJ = sid(14520)  # Benchmark: MSCI Japan, Large-cap JP  
    # context.IEV = sid(21769)  # Benchmark: S&P Europe 350, Large-cap European  
    # context.EWG = sid(14518)  # Benchmark: MSCI Germany, Large-cap Germany  
    # context.EWU = sid(14529)  # Benchmark: MSCI UK, Large-cap UK  
    # Fixed Income  
    # context.LQD = sid(23881)  # Benchmark: iBoxx IG Corporates, US  
    # context.HYG = sid(33655)  # Benchmark: iBoxx HY Corporates, US  
    # context.IEF = sid(23870)  # Benchmark: 7-10yr Treasuries, US  
    # context.IEI = sid(33151)  # Benchmark: 3-7yr Treasuries, US  
    context.SHY = sid(23911)  # Benchmark: 1-3yr Treasuries, US  
    context.SHV = sid(33154)  # Benchmark: Barclays U.S. Short Treasury Bond , US  
    # context.EMB = sid(35323)  # Benchmark: JPM EMBI Global Core Index, Global EM FI  
    # Commodities  
    # context.IAU = sid(26981)  # Benchmark: London Gold PM Fix  
    # Volatility  
    context.VXX = sid(38054) # S&P 500 VIX Short-Term Futures  
    # Maximum and Minimum amounts we want the algorithm to go long  
    context.max_notional = 1000000.1  
    context.min_notional = -1000000.0  
# Computing logarithmic returns of assets  
def CC_Returns(datapanel):

Basically, I understand that I can use the history function to pull the historical data for the past x days, but surely there is an easier way than typing out

close_2d = history(2, "1d", "price")
.... close_100d = history(100, "1d", "price")

and from then on I could calculate:
log(close_2d / close_3d), ..... log(close_99d / close_100d)

and store these variables in a list.

How can this actually be translated into useful code? Thank you all for your help!

12 responses

Hello Philipp,

Keep in mind that you only need to call history once, and then use pandas indexing to pick the data you need (see the postings on https://www.quantopian.com/posts/frustrating-experience-with-switching-to-minute-backtests, for example).
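
A minimal sketch of that idea, using a hand-built DataFrame as a stand-in for what a single `history(100, "1d", "price")` call would return. The dates, column labels, and prices are invented for illustration, and the oldest-row-first layout is an assumption: fetch the longest window once, then slice the shorter windows out of it.

```python
import numpy as np
import pandas as pd

# Stand-in for the frame returned by one history() call:
# rows are trading days (oldest first), columns are assets.
# Dates, labels and prices here are invented for illustration.
dates = pd.date_range("2013-01-01", periods=100, freq="B")
prices = pd.DataFrame(
    {"IVV": np.linspace(140.0, 150.0, 100),
     "SHY": np.linspace(84.0, 84.5, 100)},
    index=dates,
)

# Pick shorter windows from the one frame instead of calling history again.
last_2_days = prices.iloc[-2:]    # the two most recent rows
last_20_days = prices.iloc[-20:]  # the twenty most recent rows
latest_close = prices.iloc[-1]    # Series: most recent price per asset
```

Positional slicing with `.iloc` keeps the date index, so the shorter windows stay aligned with the full frame.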

I suggest having a look at http://pandas.pydata.org/pandas-docs/stable/computation.html#moving-rolling-statistics-moments. I've never done it, but it appears you could write a custom rolling computation on the dataframe returned by history ("The rolling_apply function takes an extra func argument and performs generic rolling computations."). Right?
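
A rough sketch of such a custom rolling computation. `rolling_apply` is the older spelling; recent pandas releases write the same thing as `.rolling(window).apply(func)`, which is what this example assumes. The prices and the column label are made up, and `log_return` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# Made-up daily closes, oldest first; the column label is illustrative.
prices = pd.DataFrame({"IVV": [136.3, 137.2, 138.5, 141.6, 146.3, 144.4, 147.3]})

def log_return(window):
    # With raw=True, window is a NumPy array of two consecutive prices.
    return np.log(window[1] / window[0])

# Slide a 2-row window down each column and apply the custom function.
log_returns = prices.rolling(window=2).apply(log_return, raw=True)

# The first row has no previous price, so it is NaN; drop it.
log_returns = log_returns.dropna()
```

For plain log returns a vectorized expression is shorter, but the rolling form generalizes to any function of a trailing window.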

Grant

Thank you Grant - I will look into this now, if I figure it out I will post something on here for everyone!

Philipp

Turns out that the pct_change() method might work for you (see http://pandas.pydata.org/pandas-docs/stable/computation.html#moving-rolling-statistics-moments). I've attached a backtest to illustrate. --Grant

import pandas as pd

set_commission(commission.PerShare(cost=0))  
set_slippage(slippage.FixedSlippage(spread=0.00))

def initialize(context):  
    context.secs =   [ sid(19662),  # XLY Consumer Discretionary SPDR Fund
                       sid(19656),  # XLF Financial SPDR Fund
                       sid(19658),  # XLK Technology SPDR Fund
                       sid(19655),  # XLE Energy SPDR Fund
                       sid(19661),  # XLV Health Care SPDR Fund
                       sid(19657),  # XLI Industrial SPDR Fund
                       sid(19659),  # XLP Consumer Staples SPDR Fund
                       sid(19654),  # XLB Materials SPDR Fund
                       sid(19660) ] # XLU Utilities SPDR Fund
def handle_data(context, data):  
    # get trailing window of daily closes as pandas dataframe  
    prices = history(101,'1d','price')[0:-1] # drop last minute bar in frame  
    pct_change = prices.pct_change()  
    print get_datetime()  
    print pct_change.head()  
    print pct_change.tail()  
    # window = 6  
    # rolling_mean = pd.rolling_mean(prices,6)[window-1:] # drop leading NaNs  
    # std = rolling_mean.std(axis=0) # pandas series (sorted ascending by sid)  
    #  
    # print get_datetime()  
    # print std  
    # print type(std)  
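
Note that `pct_change()` gives simple returns, P_t / P_{t-1} - 1, rather than continuously compounded ones. A small sketch of the conversion, assuming a made-up price series: the log return is log(1 + simple return), which `np.log1p` computes.

```python
import numpy as np
import pandas as pd

# Made-up daily closes, oldest first.
prices = pd.Series([136.3, 137.2, 138.5, 141.6, 146.3, 144.4, 147.3])

simple_returns = prices.pct_change().dropna()  # P_t / P_{t-1} - 1
log_returns = np.log1p(simple_returns)         # log(P_t / P_{t-1})

# Log returns add up over time: their sum equals the log of the
# total price ratio over the whole window.
total = log_returns.sum()
```

The additivity shown in the last line is the main practical reason to prefer log returns when compounding over several periods.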

Hi Grant,

your help is much appreciated. This is of course very helpful but I still want to figure this one out. I did some experimenting with 'dummy' data in my IDE, hopefully driving me closer to a solution - and I think I am onto something:

import math  
price_history = [147.3, 144.4, 146.3, 141.6, 138.5, 137.2, 136.3]  
def CC_return(numbers):  
    result = []  
    for i in range(len(numbers)):  
        result = math.log(numbers[i] / numbers[i+1])  
        return result  
print CC_return(price_history)  

It prints 0.0198840970099, which is the result of math.log(147.3/144.4).

However, what I was expecting was a list along the lines of [0.019884, -0.01307, 0.0326, ...], or expressed as a formula: [math.log(147.3/144.4), math.log(144.4/146.3), ....]

Can anyone explain to me why it only printed the result of the first calculation?

Best,

Philipp

Hi Philipp,

Not sure what your background is in terms of programming. Have you used MATLAB? Basically, I think you are looking to operate on vectors/matrices, element-wise, which Python supports (numpy/scipy and pandas).

Grant

Hi Grant,

I will look into it - gotta figure this out and yes, I have some MATLAB experience.

Best,

Philipp

There is a kind of mapping between MATLAB and Python numpy/scipy (e.g. http://wiki.scipy.org/NumPy_for_Matlab_Users). So, that might be a good starting place for you. Pandas and numpy are "friendly" as well, but I don't have a handy reference. In the end, anything that can be done in MATLAB can most likely be done in Python (but for free). --Grant

Philipp, your last code snippet has a few things that are keeping it from working.

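  1. It defines a result list but then overwrites it with a single number inside the for loop.
  2. The return is nested inside the for loop, so the function exits on the first pass and returns only the first result calculated.
  3. Your list indices are off. With range(len(numbers)), the last pass of the loop would evaluate numbers[i+1] past the end of the list and throw an IndexError. Remember indexing starts at zero.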

Here is a fixed version with explanations.

import math  

price_history = [147.3, 144.4, 146.3, 141.6, 138.5, 137.2, 136.3]  

def CC_return(numbers):  
    # Define result array  
    result = []  
    # Only iterate to len(numbers) - 1 since (i + 1) is used as an index in the loop  
    for i in range(len(numbers) -1):  
        # Use a new variable name inside the loop  
        result_from_this_iteration = math.log(numbers[i] / numbers[i+1])  
        # Append the result to the end of the result list.  
        result.append(result_from_this_iteration)  
    # return the result array outside the for loop  
    return result

print CC_return(price_history)  
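
The same loop can also be written more compactly. This is only an alternative sketch of the function above (the name `cc_returns` is illustrative), pairing each price with its successor via `zip`:

```python
import math

price_history = [147.3, 144.4, 146.3, 141.6, 138.5, 137.2, 136.3]

def cc_returns(numbers):
    # zip pairs numbers[i] with numbers[i + 1] and stops at the shorter
    # sequence, so there is no index to run off the end of the list.
    return [math.log(a / b) for a, b in zip(numbers, numbers[1:])]

returns = cc_returns(price_history)
print(returns)
```

Because `zip` stops at the end of the shorter input, the off-by-one problem from the original loop cannot occur here.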

Thank you David! This works well and I'll try to put this into my algo now! Much appreciated - also, thanks to Grant, I have a couple more ideas and places to look into to work on my Python! :-)

Best,

Philipp

Philipp,

Here's a numpy example (which is the kind of thing that one would do naturally in MATLAB):

import numpy as np

def initialize(context):  
    context.spy = sid(8554)

def handle_data(context, data):  
    price_history = np.array([147.3, 144.4, 146.3, 141.6, 138.5, 137.2, 136.3])  
    print price_history  
    ratio = price_history[1:]/price_history[0:-1]  
    print ratio  
    ratio_log = np.log(ratio)  
    print ratio_log  

Since history returns a pandas dataframe, you may want to look into the same kind of approach with pandas directly (I think it is supported).
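
A minimal sketch of the pandas version, with a hand-built DataFrame standing in for the output of `history`. The oldest-row-first layout is an assumption here, and the column label and values are illustrative.

```python
import numpy as np
import pandas as pd

# Stand-in for history(...): one column per asset, oldest price first.
prices = pd.DataFrame({"SPY": [136.3, 137.2, 138.5, 141.6, 146.3, 144.4, 147.3]})

# shift(1) moves every price down one row, so each row is divided by
# the previous day's price; the first row has no predecessor (NaN).
log_returns = np.log(prices / prices.shift(1)).dropna()
```

This works column by column, so the same line handles a frame with many assets without any explicit loop.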

Grant

That code was more of an instructional piece to demonstrate flow control. Grant's solution is the better approach: you should look to use numpy/scipy/pandas to take advantage of vectorized computations when possible.

Thank you! I got myself a new book on Python (for data analysis) and am trying to get a better understanding of Python for data analysis and manipulation! Philipp