Computing compounded returns

Hello Philipp,

Keep in mind that you only need to call history once, and then use pandas indexing to pick the data you need (see the postings on https://www.quantopian.com/posts/frustrating-experience-with-switching-to-minute-backtests, for example).

I suggest having a look at http://pandas.pydata.org/pandas-docs/stable/computation.html#moving-rolling-statistics-moments. I've never done it, but it appears you could write a custom rolling computation on the dataframe returned by history ("The rolling_apply function takes an extra func argument and performs generic rolling computations."). Right?
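For what it's worth, here is a minimal sketch of a custom rolling computation. It assumes a recent pandas, where the old rolling_apply function has been replaced by the .rolling(...).apply(...) method, and it runs on made-up prices rather than the frame returned by history. The ticker names and the compounded_return helper are hypothetical.

```python
import pandas as pd

# Made-up daily closes for two hypothetical tickers (illustration only)
prices = pd.DataFrame({
    "AAA": [100.0, 101.0, 99.0, 102.0, 104.0, 103.0],
    "BBB": [50.0, 50.5, 51.0, 50.0, 49.5, 51.5],
})

def compounded_return(window_prices):
    # With raw=True, window_prices is a 1-D numpy array holding one window
    return window_prices[-1] / window_prices[0] - 1.0

# 3-bar rolling compounded return, computed column by column;
# the first two rows are NaN because the window is not yet full
rolling_ret = prices.rolling(window=3).apply(compounded_return, raw=True)
print(rolling_ret)
```

Any function that reduces one window to a single number could be passed in the same way.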

Grant

Thank you Grant. I will look into this now, and if I figure it out I will post something here for everyone!

Philipp

Turns out that the pct_change() method might work for you (see http://pandas.pydata.org/pandas-docs/stable/computation.html#moving-rolling-statistics-moments). I've attached a backtest to illustrate. --Grant

import pandas as pd

set_commission(commission.PerShare(cost=0))  
set_slippage(slippage.FixedSlippage(spread=0.00))

def initialize(context):  
    context.secs =   [ sid(19662),  # XLY Consumer Discretionary SPDR Fund
                       sid(19656),  # XLF Financial SPDR Fund
                       sid(19658),  # XLK Technology SPDR Fund
                       sid(19655),  # XLE Energy SPDR Fund
                       sid(19661),  # XLV Health Care SPDR Fund
                       sid(19657),  # XLI Industrial SPDR Fund
                       sid(19659),  # XLP Consumer Staples SPDR Fund
                       sid(19654),  # XLB Materials SPDR Fund
                       sid(19660) ] # XLU Utilities SPDR Fund

def handle_data(context, data):
    # get trailing window of daily closes as pandas dataframe  
    prices = history(101,'1d','price')[0:-1] # drop last minute bar in frame  
    pct_change = prices.pct_change()  
    print get_datetime()  
    print pct_change.head()  
    print pct_change.tail()  
    # window = 6  
    # rolling_mean = pd.rolling_mean(prices,6)[window-1:] # drop leading NaNs  
    # std = rolling_mean.std(axis=0) # pandas series (sorted ascending by sid)  
    #  
    # print get_datetime()  
    # print std  
    # print type(std)
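
If the goal is a compounded return over the whole window, the per-period values from pct_change() can be chained together. This is only a sketch on made-up numbers, assuming the series is ordered oldest to newest; it runs outside the backtester.

```python
import pandas as pd

# Made-up closing prices, oldest first (illustration only)
prices = pd.Series([100.0, 102.0, 101.0, 105.0, 103.0])

# Simple per-period returns; the first element is NaN and is dropped
simple = prices.pct_change().dropna()

# Compounding: multiply the growth factors (1 + r) together, then subtract 1
compounded = (1.0 + simple).prod() - 1.0

# Running compounded return after each period
cumulative = (1.0 + simple).cumprod() - 1.0

# Sanity check: the same number obtained directly from first and last price
direct = prices.iloc[-1] / prices.iloc[0] - 1.0
print(compounded)
print(direct)
```

The two printed values should agree up to floating-point rounding, since the intermediate prices cancel out in the product.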

Hi Grant,

Your help is much appreciated. This is of course very helpful, but I still want to figure this one out myself. I did some experimenting with 'dummy' data in my IDE, which I hope brings me closer to a solution, and I think I am onto something:

import math  
price_history = [147.3, 144.4, 146.3, 141.6, 138.5, 137.2, 136.3]  
def CC_return(numbers):  
    result = []  
    for i in range(len(numbers)):  
        result = math.log(numbers[i] / numbers[i+1])  
        return result  
print CC_return(price_history)

It prints 0.0198840970099, which is the result of math.log(147.3/144.4).

However, what I was expecting was a list such as [0.019884, -0.01307, 0.0326], or expressed as a formula, [math.log(147.3/144.4), math.log(144.4/146.3), ...].

Can anyone explain to me why it only printed a single value instead of the whole list?

Best,

Philipp

Hi Philipp,

Not sure what your background is in terms of programming. Have you used MATLAB? Basically, I think you are looking to operate on vectors/matrices, element-wise, which Python supports (numpy/scipy and pandas).

Grant

Hi Grant,

I will look into it. I have to figure this out, and yes, I have some MATLAB experience.

Best,

Philipp

There is a kind of mapping between MATLAB and Python numpy/scipy (e.g. http://wiki.scipy.org/NumPy_for_Matlab_Users). So, that might be a good starting place for you. Pandas and numpy are "friendly" as well, but I don't have a handy reference. In the end, anything that can be done in MATLAB can most likely be done in Python (but for free). --Grant

David Edwards

Philipp, your last code snippet has a few things keeping it from working:

It defines a result list but then reassigns it inside the for loop.
The return is nested in the for loop, so the function returns the first result calculated and exits.
Your list indices are off. Without the early return, numbers[i+1] would throw an IndexError on the last pass of the loop. Remember that indexing starts at zero, so the last valid index is len(numbers) - 1.

Here is a fixed version with explanations.

import math  

price_history = [147.3, 144.4, 146.3, 141.6, 138.5, 137.2, 136.3]  

def CC_return(numbers):  
    # Define result array  
    result = []  
    # Only iterate to len(numbers) - 1 since (i + 1) is used as an index in the loop  
    for i in range(len(numbers) -1):  
        # Use a new variable name inside the loop  
        result_from_this_iteration = math.log(numbers[i] / numbers[i+1])  
        # Append the result to the end of the result list.  
        result.append(result_from_this_iteration)  
    # return the result array outside the for loop  
    return result

print CC_return(price_history)
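
The same loop can be sketched more compactly by pairing each price with its successor. The function names below are hypothetical. Note also that the thread does not say whether the list runs oldest to newest; if it does, the conventional log return is log(later / earlier), which is the negative of what CC_return computes.

```python
import math

price_history = [147.3, 144.4, 146.3, 141.6, 138.5, 137.2, 136.3]

def cc_return_pairs(numbers):
    # zip pairs each price with the next one, so no index arithmetic is needed;
    # same direction as CC_return above: log(current / next)
    return [math.log(a / b) for a, b in zip(numbers, numbers[1:])]

def log_returns(numbers):
    # Assumes the list is ordered oldest to newest: log(later / earlier)
    return [math.log(b / a) for a, b in zip(numbers, numbers[1:])]

print(cc_return_pairs(price_history))
print(log_returns(price_history))

# Log returns add up: their sum is the log of (last price / first price)
print(sum(log_returns(price_history)))
```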

Thank you David! This works well, and I'll try to put it into my algo now. Much appreciated. Thanks to Grant as well, I have a couple more ideas and places to look into to work on my Python! :-)

Best,

Philipp

Philipp,

Here's a numpy example (which is the kind of thing that one would do naturally in MATLAB):

import numpy as np

def initialize(context):  
    context.spy = sid(8554)

def handle_data(context, data):  
    price_history = np.array([147.3, 144.4, 146.3, 141.6, 138.5, 137.2, 136.3])  
    print price_history  
    ratio = price_history[1:]/price_history[0:-1]  
    print ratio  
    ratio_log = np.log(ratio)  
    print ratio_log

Since history returns a pandas dataframe, you may want to look into the same kind of approach with pandas directly (I think it is supported).
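
A rough sketch of what that could look like with pandas, using the same dummy prices in a one-column dataframe (the column name XYZ is made up, and I am assuming rows are ordered oldest to newest, the way a trailing window would be):

```python
import numpy as np
import pandas as pd

# Dummy prices in a one-column frame standing in for the output of history
prices = pd.DataFrame(
    {"XYZ": [147.3, 144.4, 146.3, 141.6, 138.5, 137.2, 136.3]}
)

# shift(1) moves every row down by one, so each price is divided by the
# previous one; the first row becomes NaN and is dropped
log_ret = np.log(prices / prices.shift(1)).dropna()
print(log_ret)

# Log returns are additive, so the compounded return over the window is
# exp(sum of log returns) - 1, one value per column
total = np.exp(log_ret.sum()) - 1.0
print(total)
```

This works column by column, so a frame with many securities would be handled without any explicit loop.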

Grant

David Edwards

That code was more of an instructional piece to demonstrate flow control. Grant's solution is the better approach: you should use numpy/scipy/pandas to take advantage of vectorized computations when possible.