kalman filter

Trying to code my first Python class. Should I be naming the updateState() method handle_add() instead? I got confused by the function decorators (https://github.com/quantopian/zipline/blob/master/zipline/transforms/utils.py), started googling those, but it's a bit late so I'll check it out tomorrow.

Any tips on how to fit this before I run it on out-of-sample data? I'm not too familiar with the Python modules.

Notation (sort of) follows Shumway and Stoffer's book "Time Series Analysis and Its Applications: With R Examples."

import numpy as np  
from zipline.transforms.utils import EventWindow

class Kalman(EventWindow):
    def __init__(self, mu0, Sigma0, PHIfit, Qfit, Rfit):
        # initial state guesses are mu0, Sigma0
        # note we are not assuming Bayesian priors...just fixed values
        self.X = mu0
        self.P = Sigma0

        # things that should be from a fit are PHI, Q, and R
        self.PHI = PHIfit
        self.Q = Qfit
        self.R = Rfit
        # NOTE: the observation matrix self.A is used below but is not set here yet

    # some convenience functions
    # (all inputs are assumed to be np.matrix objects, so * is a matrix product)
    def predictState(self, prevX):
        return self.PHI * prevX

    def predictStateCov(self, prevP):
        return self.PHI * prevP * self.PHI.T + self.Q

    def predictObserv(self, prevX):
        return self.A * self.predictState(prevX)

    def predictObservCov(self, prevP):
        return self.A * self.predictStateCov(prevP) * self.A.T + self.R

    # this is the guy that gets called every minute
    def updateState(self, observedY, X, P):
        predX = self.predictState(X)
        predP = self.predictStateCov(P)
        GAIN = predP * self.A.T * (self.A * predP * self.A.T + self.R).getI()
        innov = observedY - self.predictObserv(X)
        I = np.eye(self.X.shape[0])
        self.X = predX + GAIN * innov
        self.P = (I - GAIN * self.A) * predP

def initialize(context):  
    context.stock = sid(26578)  
4 responses

Hi Taylor,

The algorithm needs to have a top-level handle_data method. So, you could instantiate your Kalman object in initialize, and then pass updates into it from handle_data. Could you explain the parameters expected by the Kalman class's __init__ and updateState methods?

You would do something like this:

# assuming the Kalman class above is defined first in the script

def initialize(context):
    context.stock = sid(26578)
    # construct the Kalman class
    context.kalman = Kalman(...)

def handle_data(context, data):
    # pass data to Kalman to update state
    context.kalman.updateState(...)

It seems like updateState is expecting matrix parameters - you may be able to use a batch transform to build those parameters. The algorithm would look something like this:

# assuming the Kalman class above is defined first in the script

def initialize(context):
    context.stock = sid(26578)
    # construct the Kalman class
    context.kalman = Kalman(...)

def handle_data(context, data):
    # pass data to Kalman to update state
    update_kalman(data, context.kalman)

@batch_transform(refresh_period=1, window_length=20)  
def update_kalman(datapanel, kalman):  
    # do stuff to prices to calculate the parameters for the update  
    kalman.updateState(...)  

thanks,
fawce

:::So, you could instantiate your Kalman object in the initialize, and then pass updates into it from handle_data.

Right. Just worried about defining the class right now.

:::It seems like updateState is expecting matrix parameters

Yeah. And batch transforms return pandas panels, right? So inside update_kalman() I'd have to define some auxiliary pandas stuff as matrices, then make the call to updateState, and then it returns more panel objects? Damn. You guys made the right choice using panels, but state space models are a bit difficult without matrices.

This is your @batch_transform function decorator, no? I may ask you about this later.

def batch_transform(func):  
    """Decorator function to use instead of inheriting from BatchTransform.  
    For an example on how to use this, see the doc string of BatchTransform.  
    """

    def create_window(*args, **kwargs):  
        # passes the user defined function to BatchTransform which it  
        # will call instead of self.get_value()  
        return BatchTransform(*args, func=func, **kwargs)

    return create_window  
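
The pattern in that decorator can be tried outside zipline with a toy version. This is only a sketch: ToyWindow and toy_batch_transform are hypothetical stand-ins made up for illustration, not zipline's BatchTransform or its actual behaviour. It shows the one thing that tends to confuse at first, namely that the decorated name ends up bound to a factory (create_window) rather than to the original function.

```python
class ToyWindow:
    """Hypothetical stand-in for BatchTransform: keeps the last few events."""

    def __init__(self, func, window_length=3):
        self.func = func
        self.window_length = window_length
        self.events = []

    def handle_data(self, event, *args):
        # Store the event and keep only the most recent window_length of them.
        self.events.append(event)
        self.events = self.events[-self.window_length:]
        if len(self.events) < self.window_length:
            return None  # window not full yet
        # Call the user's function on a copy of the current window.
        return self.func(list(self.events), *args)


def toy_batch_transform(func):
    """Same shape as the decorator above: returns a factory, not a wrapper."""

    def create_window(*args, **kwargs):
        # Hand the user-defined function to the window object.
        return ToyWindow(*args, func=func, **kwargs)

    return create_window


@toy_batch_transform
def window_mean(prices):
    return sum(prices) / len(prices)


# After decoration, window_mean is create_window, so calling it
# builds a ToyWindow that wraps the original function.
transform = window_mean(window_length=3)
results = [transform.handle_data(p) for p in [1.0, 2.0, 3.0, 4.0]]
print(results)  # [None, None, 2.0, 3.0]
```

So the user-written function only ever sees a full window, and the object returned by the factory is what gets fed one event at a time.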

:::Could you explain the parameters expected by the Kalman class' init and the updateState methods?

State space models have a latent variable equation and an observed variable equation. X is the latent state (think of it as a filtered true price/price vector), and P is the covariance of that filtered price/price vector. Both of these are conditional on all the observations (Y) that have been seen so far. The notation is a bit deprecated. The Kalman filter algorithm updates these two quantities every minute, using the Kalman filter equations. Deriving those equations requires Bayes' rule and the theorem that gives the conditional distribution of one block of a jointly Gaussian vector given another block.

In the latent equation, X is assumed to be Markovian. Phi is its transition matrix. It's assumed to be fixed here. Q is the covariance of the error term.

R is the covariance of the error term in the observation equation. A would've been the matrix that transforms the X into the Y, but I think I forgot to define it.
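
For reference, one predict/update step for the model described above can be sketched in plain NumPy. This is a minimal sketch under assumptions, not the code from the first post: the class name KalmanSketch, the update method, and passing A to the constructor are all made up here, plain arrays with the @ operator are used instead of np.matrix, and Phi, A, Q and R are taken as fixed and known.

```python
import numpy as np


class KalmanSketch:
    """Filter for the linear Gaussian model (hypothetical helper class):
        x_t = PHI x_{t-1} + w_t,  w_t ~ N(0, Q)
        y_t = A x_t + v_t,        v_t ~ N(0, R)
    """

    def __init__(self, mu0, Sigma0, PHI, A, Q, R):
        self.X = np.asarray(mu0, dtype=float)     # filtered state mean
        self.P = np.asarray(Sigma0, dtype=float)  # filtered state covariance
        self.PHI = np.asarray(PHI, dtype=float)
        self.A = np.asarray(A, dtype=float)
        self.Q = np.asarray(Q, dtype=float)
        self.R = np.asarray(R, dtype=float)

    def update(self, y):
        # Predict step: push the last filtered state through the transition.
        pred_x = self.PHI @ self.X
        pred_P = self.PHI @ self.P @ self.PHI.T + self.Q
        # Innovation (observation minus its prediction) and its covariance.
        innov = np.asarray(y, dtype=float) - self.A @ pred_x
        S = self.A @ pred_P @ self.A.T + self.R
        # Gain K = pred_P A' S^{-1}. Solving S K' = A pred_P avoids an
        # explicit inverse; this relies on S and pred_P being symmetric.
        gain = np.linalg.solve(S, self.A @ pred_P).T
        # Update step.
        self.X = pred_x + gain @ innov
        self.P = (np.eye(self.X.shape[0]) - gain @ self.A) @ pred_P
        return self.X, self.P


# One-dimensional example: constant latent level, noisy observations.
kf = KalmanSketch(mu0=[0.0], Sigma0=[[1.0]], PHI=[[1.0]],
                  A=[[1.0]], Q=[[0.0]], R=[[1.0]])
x1, P1 = kf.update([2.0])
print(x1, P1)  # [1.] [[0.5]]
```

In the example the prior variance and the observation variance are both 1, so the gain is 0.5 and the filtered state lands halfway between the prior mean 0 and the observation 2.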

Hi,

Batch transforms receive a pandas panel, which is a keyed set of dataframes. The dataframe has a method to convert to a numpy matrix (doco here).
Your batch can return anything you wish, so you could return the matrix, or some output from the Kalman class.
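
As a rough illustration of that conversion, here is a small sketch using a single DataFrame. The tickers and prices are invented, and the sketch uses DataFrame.to_numpy() from current pandas, which may not be the method linked above; the Panel type itself was later removed from pandas, so a plain DataFrame stands in for one slice of the panel.

```python
import numpy as np
import pandas as pd

# Hypothetical window of prices: rows are bars, columns are securities.
prices = pd.DataFrame(
    {"AAA": [10.0, 10.5, 10.2], "BBB": [20.0, 19.8, 20.1]}
)

# DataFrame -> 2-D NumPy array of shape (n_bars, n_securities).
price_matrix = prices.to_numpy()

# Most recent bar as a column vector, usable as an observation Y.
latest_y = price_matrix[-1].reshape(-1, 1)

print(price_matrix.shape)  # (3, 2)
print(latest_y.shape)      # (2, 1)
```

A batch function could return latest_y, the whole price_matrix, or whatever the filter produces from them.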

thanks,
fawce

Maybe something like this? Not sure I understand everything completely.