Hi,
I went through the code again and I think the problem actually happens well before the definition of mu. At some point I have to run:
s = pm.MvGaussianRandomWalk('s', cov=sigma_diag, shape=(1000, 2))
s_diag = pm.Deterministic('s_mat', T.nlinalg.diag(pm.math.exp(-2*s)))
nu = pm.Uniform('nu', 0, 5)
C_triu = pm.LKJCorr('C_triu', nu, 2)
C = pm.Deterministic('C', T.fill_diagonal(C_triu[np.zeros((2, 2), dtype=np.int64)], 1.))
cov = pm.Deterministic('cov', T.nlinalg.matrix_dot(s_diag, C, s_diag))
The shape of s is (2, 2), but the shape of s_diag is (2,). This also means that cov then has shape (). This is all very different from the case of a time-independent standard deviation, and I assume that Theano has problems converting a (2, 1000) matrix into a diagonal ...
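For what it's worth, the behaviour of T.nlinalg.diag alone would explain these shapes: on a 1-D input it builds a diagonal matrix, but on a 2-D input it extracts the main diagonal. A minimal check in plain Theano (no PyMC3 involved):

import numpy as np
import theano
import theano.tensor as T
import theano.tensor.nlinalg  # makes T.nlinalg available

vec = T.vector('vec')
mat = T.matrix('mat')
to_matrix = theano.function([vec], T.nlinalg.diag(vec))
to_vector = theano.function([mat], T.nlinalg.diag(mat))

x = np.array([1., 2.], dtype=theano.config.floatX)
y = np.ones((1000, 2), dtype=theano.config.floatX)
print(to_matrix(x).shape)  # (2, 2): vector -> diagonal matrix
print(to_vector(y).shape)  # (2,): matrix -> its main diagonal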
The reason is that you pass in a 3-D covariance (a time-varying 2-D cov
matrix) with a 1-D mu; mu would need to be 2-D -- one mu vector per
time-point. You probably also need to specify the shape in that case.
That, however, might be problematic in PyMC3, as higher-dim
multivariate distributions are not well supported
(https://github.com/pymc-devs/pymc3/issues/535#issuecomment-261582945).
It's worth a try though. An alternative would be to loop and create a
single MvNormal per time-point. As that would be excessive, I would bin
time so that there is one parameter setting (and one MvNormal) for e.g.
50 data-points, similar to what is done in the blog post; see the
sketch below.
So it seems very likely that we have to loop through the time-points. I am, however, not sure how to keep the memory of the values of the last time-point -- that seems to be exactly the nice feature of the GaussianRandomWalk. Most likely this would have to happen through the prior of the next time-step?
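Is it roughly this that you mean, i.e. a hand-rolled walk where the prior of each step is centred on the variable of the previous step? (Names and step size are made up.)

import pymc3 as pm

n_steps, step_sd = 20, 0.1  # made-up sizes

with pm.Model() as model:
    # the first step gets a free prior ...
    steps = [pm.Normal('log_s_0', mu=0., sd=1., shape=2)]
    for i in range(1, n_steps):
        # ... and every later step is centred on the previous one,
        # which is how the walk "remembers" the last value
        steps.append(pm.Normal('log_s_%d' % i, mu=steps[-1],
                               sd=step_sd, shape=2))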