Calculating M.A. and St.Dev. of variables

Hello guys,

I'm coding an algorithm and I need to calculate the SMA and standard deviation of a "factor" I derived from the price of two stocks. Example:

ratio= price(a) / price(b)
ratioMA= M.A. (a) / M.A. (b)
factor = ratio/ratioMA
factorMA= ???
factorST.DEV= ???

I tried deriving them from the SMA and St.Dev of the prices of the stocks but I'm finding some difficulties. Does anybody know the solution?

Thank you in advance.

18 responses

Hi Matteo,

I think you want to use the rolling statistics available in pandas - http://pandas.pydata.org/pandas-docs/dev/computation.html#moving-rolling-statistics-moments

The attached backtest imports the rolling_mean function and calls it on a timeseries of data calculated from price history. Since it uses the history function, it will only run in minute mode.
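
For readers on a recent pandas release, where the standalone rolling_mean function is no longer available, the same idea can be sketched with the .rolling() method. This is only a minimal sketch, not the attached backtest: the prices, the column names "a" and "b", and the 3-bar window are made up for illustration.

```python
import pandas as pd

# Hypothetical daily closes for the two stocks (made-up numbers).
prices = pd.DataFrame({
    "a": [10.0, 10.2, 10.1, 10.4, 10.3, 10.6],
    "b": [20.0, 20.1, 20.4, 20.2, 20.5, 20.7],
})

window = 3  # assumed lookback, in bars

# ratio = price(a) / price(b)
ratio = prices["a"] / prices["b"]
# ratioMA = M.A.(a) / M.A.(b), as defined in the question
ratio_ma = prices["a"].rolling(window).mean() / prices["b"].rolling(window).mean()
# factor = ratio / ratioMA
factor = ratio / ratio_ma

# Rolling statistics computed on the derived series itself,
# rather than derived from the statistics of the two prices.
factor_ma = factor.rolling(window).mean()
factor_std = factor.rolling(window).std()  # sample standard deviation (ddof=1)
```

Note that the first few values of factor_ma and factor_std are NaN until enough factor values have accumulated to fill a window.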

Is this what you were looking to do?

thanks,
fawce

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Thank you for the help John.

However, I'm getting an error while trying to implement the code you provided into my algorithm. I've attached the code to this post so that you can have a look.

Hi Matteo,

I took a stab at plugging Fawce's sample code into your algo and it seems to be working now. Not sure what error you were getting, but if you forget to import the rolling mean function from pandas (like I did at first) you will get an error there.

I've attached my version here.

Best, Jess

Thank you Jess.

So, correct me if I'm wrong: the algorithm now stores 4 days of daily data and calculates the moving average and standard deviation of the "factor" over the last three days? And to use minute data, should I write 1m instead of 1d in the price_hist variable?

Thank you in advance.

Matteo

Hi Matteo,

Yes - sorry, I noticed the numbers had been changed to be shorter than your 10-day example. The first cut of the history() API only allows you to ask for down-sampled daily data; '1m' is not currently supported.

I'd like to hear more about your use case for the minutely data - are you looking to take a moving average over several trailing days of minutely data where you basically treat it as one continuous time series?

Best, Jess

As you can see from my code, I'm studying a pairs trading strategy. I'd like to calculate the value of a "factor" every minute and compare it to its moving average, defining entry and exit thresholds based on its standard deviation. If I use the code that you provided, I don't get any error, but I'm not sure the algo is working properly because the "factor" variable is defined twice, and from the backtests I see that trades are not opened and closed as they should be.
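
The entry and exit rule described here could look roughly like the sketch below. It is not the code attached to this thread: the function name next_position, the entry_k and exit_k multipliers, and the mean-reversion direction (short the ratio when the factor is high) are all assumptions made for illustration, and it assumes the standard deviation is greater than zero.

```python
def next_position(position, factor_now, mean, std, entry_k=2.0, exit_k=0.5):
    """Return the new position given the current factor value.

    position: -1 = short the ratio, 0 = flat, +1 = long the ratio.
    mean, std: moving average and standard deviation of the factor (std > 0).
    entry_k, exit_k: hypothetical threshold multipliers.
    """
    # Distance of the factor from its moving average, in standard deviations.
    z = (factor_now - mean) / std
    if position == 0:
        if z > entry_k:
            return -1  # factor unusually high: bet on it falling back
        if z < -entry_k:
            return 1   # factor unusually low: bet on it rising back
        return 0
    # Already in a trade: close once the factor has reverted near its mean.
    if abs(z) < exit_k:
        return 0
    return position
```

Keeping the decision in a small pure function like this makes it easy to check by hand before wiring it into handle_data.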

Hi Matteo,

Thanks for catching that - I did have your original logic and Fawce's rolling historical calculation stepping on each other's toes. I've corrected that here.

The one aspect that doesn't currently work the way I think you want it to is the moving average and st dev of the derived 'factor'. Because history() currently only works with trailing daily data, you can only return a daily factor history as opposed to a minutely series. There are a few more manual ways of brute forcing this, but I think the preferable solution is for us to bump up the priority on supporting '1m' data in history(). I have done that, but don't have an ETA at this time.

Let me know if this looks more like the behavior you were expecting and if you have additional comments, questions or suggestions please let me know!
Best wishes, Jess

Thank you very much Jessica.

This version of the algorithm is excellent. I understand that at the moment I can only store the daily history of variables, and I look forward to seeing future developments. Anyway, even with the daily history, the rationale of the strategy is still intact given that the "factor" is calculated every single minute. The only missing feature is the ability to calculate moving averages and standard deviations based on minute data, but I'll wait for the future upgrades you mentioned.

I've created an algorithm that trades a pairs trading portfolio. However, with some symbols (the least common), it starts trading the wrong sizes. I think there must be some errors in the data for these symbols, but I'd appreciate some comments or confirmation that the problem is in the data and not in the code. Look for example at the pair MSEX/YORW.

Hi Matteo,

This is very cool! You had two main issues causing problems here.

(1) Duplicate orders being placed - your algo had a check based on the current position size: if you had 0 position in a pair, you would evaluate the spread and make a trade. This criterion is evaluated once every minute, and for the less liquid stocks there were minutes where orders were placed but not filled. So you had multiple open orders building up, effectively duplicating the trade you wanted to make many times over.

The fix for this was to put a guard at the start of handle_data that skips trading on a given pair while it has outstanding open orders. For more details on ordering and checking for open orders, check out the relevant help docs here.

(2) Because of the duplicate ordering above, your algo was applying massive amounts of leverage (hence the crazy $2m and $3m transaction sizes in your prior version). We are aware that we need to make it harder to wind up with unrealistic leverage, but currently you do have to manage this yourself: the backtester lets you borrow forever.
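
The open-order guard described in (1) can be sketched in plain Python as follows. This is an illustration rather than the code in the attached backtest: it assumes the open orders are available as a dict keyed by security with a list of unfilled orders as each value (on the platform that dict would come from the open-orders call covered in the help docs), and the tickers AAA and BBB and the order label are made-up stand-ins.

```python
def pair_has_open_orders(open_orders, stock1, stock2):
    """True if either leg of the pair still has unfilled orders.

    open_orders: dict mapping security -> list of unfilled orders
                 (assumed shape, see the note above).
    """
    return bool(open_orders.get(stock1)) or bool(open_orders.get(stock2))


# Stand-in data: MSEX has one order that has not filled yet.
open_orders = {"MSEX": ["order-1"], "YORW": []}
pairs = [("MSEX", "YORW"), ("AAA", "BBB")]

# Only evaluate the spread for pairs with nothing outstanding.
tradable = [pair for pair in pairs
            if not pair_has_open_orders(open_orders, *pair)]
```

With the guard in place, a pair whose earlier order is still waiting to fill is skipped for that minute instead of receiving a second, duplicate order.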

Then lastly, I created a pairs list structure that was more readable (at least for me) for debugging. Take a look and please especially double check that I have correctly mapped over your SIDs and the right directionality for each pair.

Hope this is helpful! Best wishes,
Jess

I really appreciate your support. Thank you.

I tried the code now and it's perfect. The only missing thing would be the ability to specify the thresholds for every pair. I can do it "the long and stupid way" but I'm sure there's a much more efficient way to do this like the one you used in your last post. I was thinking about a specification like the following:
context.upperthreshold = (1.005, 1.02,1.002, etc…) and context.lowerthreshold = (0.995, 0.98, 0.998, etc…), which should be implemented after a "for (stock1, stock2) in context.pairs". Am I right? :D

Matteo,

You could nest loops, but it might be easier to capture all the pairs in one data structure. You can add as many items to the tuple as you wish, so you could add the thresholds as additional elements:

context.pairs = (
    (sid(26807), sid(26981), 1.005, 0.995),
    (sid(23853), sid(6935), 1.02, 0.98),
    # ...etc
)
# and add it to the iterator
for (stock1, stock2, upper_thresh, lower_thresh) in context.pairs:
    pass  # use in your logic ...

thanks,
fawce

Thank you John. I did what you suggested and I've uploaded a backtest and the code to prove it. :) However, I've noticed that when adding more pairs to the algorithm, the total number of transactions doesn't increase as it should. In other words, when I try the algorithm with (for example) 2 pairs, I get about 400 transactions in a year; when I add other similar pairs (the version I've attached has 15 pairs), the total number of transactions increases only by a small amount (the attached version has 454 transactions). Is there an explanation for this?

Moreover, I was thinking about the two slippage models available in Quantopian backtesting. In this algorithm there are many illiquid securities and I'm wondering which model (and with which parameters) would be the most suitable. I'm getting good results setting the fixed slippage at 0.01 (which is the standard spread for the most liquid securities). I was trying the volumeshare method but I really have no idea about the right parameters to set. Any suggestions?
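
To make the difference between the two models concrete, here is a toy comparison in plain Python. It is a sketch under assumptions, not the backtester's implementation: it assumes the fixed model charges half the spread against the order, and that the volume share model caps the fill at a fraction of the bar's volume and moves the price by price_impact times the square of the volume share. The function names and the parameter values (volume_limit=0.25, price_impact=0.1) are illustrative only, not recommendations.

```python
def fixed_slippage_price(price, direction, spread=0.01):
    """Fill price under a fixed spread: pay half the spread against you.

    direction: +1 for a buy, -1 for a sell.
    """
    return price + direction * spread / 2.0


def volume_share_fill(price, direction, order_shares, bar_volume,
                      volume_limit=0.25, price_impact=0.1):
    """Fill price and filled shares under an assumed volume share model."""
    # At most volume_limit of the bar's traded volume can be filled.
    max_shares = int(volume_limit * bar_volume)
    filled = min(order_shares, max_shares)
    share = filled / bar_volume
    # Price moves against the order, growing with the square of the share.
    fill_price = price * (1 + direction * price_impact * share ** 2)
    return fill_price, filled
```

In this sketch, buying 500 shares in a bar that traded only 1,000 fills 250 shares at a worse price, while the fixed model fills everything half a cent away from the bar price, which is why a fixed spread can look optimistic for illiquid names.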

Hi Matteo - check out this version. There was a bug in the prior version where the algo was not evaluating the second if statement in some cases. I've fixed that here and now I see ~1400 transactions for the same time period.

There was one additional issue I found, which is that two of your pairs had a stock in common. There might be a way to get that to work, but for simplicity in this version I've commented that pair out as the way we are closing positions currently wouldn't work with having one stock be in two different pairs. Let me know if you have questions on that.
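
A quick way to spot this kind of overlap before running a backtest might look like the sketch below. It uses ticker strings in place of sid() objects, and the helper name shared_stocks, the tickers AAA, BBB and CCC, and the second YORW pairing are made up for illustration.

```python
from collections import Counter


def shared_stocks(pairs):
    """Return the stocks that appear in more than one pair.

    Each pair is a tuple whose first two elements are the two stocks;
    any further elements (such as thresholds) are ignored.
    """
    counts = Counter(stock for pair in pairs for stock in pair[:2])
    return sorted(stock for stock, n in counts.items() if n > 1)


pairs = [
    ("MSEX", "YORW", 1.005, 0.995),
    ("AAA", "BBB", 1.02, 0.98),
    ("YORW", "CCC", 1.002, 0.998),  # YORW is reused here
]
overlap = shared_stocks(pairs)
```

Any stock returned by the check belongs to two pairs, so closing a position by stock rather than by pair would interfere with the other pair's trade.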

I don't have any great suggestions for you on the slippage model, but a next step that would be interesting would be to paper trade this on Interactive Brokers if you have an account there. Then you'd get to see how their model would handle the fills and commissions. Let me know if you have questions on how to do that.

Best,
Jess

Dear Jessica,

I'm paper trading the algorithm at the moment and will update you on the results. The algorithm is perfect now: thank you very much! It's a pity we can't correctly handle pairs with a stock in common, but I'm sure we'll find a solution and, in any case, that's not a major problem.

Dear all,

I'm testing this algorithm which calculates a RATIO between price and moving average, then calculates a moving average and standard deviation of this RATIO, then calculates an up threshold and down threshold and ENTERS LONG OR SHORT on the underlying based on these thresholds. Would it be possible to modify the algorithm in order to trade a dollar universe instead of just one underlying?

Thank you very much in advance for your help,

Matteo

Hi all,

I'm trying to use the pandas rolling mean and st. dev. functions on different securities. Any hints on how to modify the above algorithm?

Thanks in advance for your help.

Matteo
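
One possible direction, sketched outside the platform: pandas rolling statistics work column by column on a DataFrame, so a price table with one column per security gives the ratio, its moving average, its standard deviation and the thresholds for every security at once. Everything specific below is an assumption made for illustration: the tickers X and Y, the prices, the window lengths, the multiplier k, and the mean-reversion direction of the signals.

```python
import pandas as pd

# Made-up closes, one column per security.
prices = pd.DataFrame({
    "X": [10.0, 10.2, 10.1, 10.4, 10.3, 10.6, 10.5],
    "Y": [50.0, 49.5, 49.8, 50.4, 50.1, 49.9, 50.6],
})

ma_window = 3    # assumed lookback for the price moving average
stat_window = 3  # assumed lookback for the ratio statistics
k = 1.0          # assumed threshold width, in standard deviations

# RATIO between price and its moving average, per security.
ratio = prices / prices.rolling(ma_window).mean()

# Moving average and standard deviation of the RATIO, per security.
ratio_ma = ratio.rolling(stat_window).mean()
ratio_std = ratio.rolling(stat_window).std()

# Up and down thresholds.
upper = ratio_ma + k * ratio_std
lower = ratio_ma - k * ratio_std

# Securities whose latest ratio is outside its band (assumed direction:
# short above the upper threshold, long below the lower one).
latest = ratio.iloc[-1]
go_short = latest[latest > upper.iloc[-1]].index.tolist()
go_long = latest[latest < lower.iloc[-1]].index.tolist()
```

Adding a security is then just a matter of adding a column to the price table; none of the calculations need to change.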