So I tried implementing batch gradient descent for a convex function over the L1 unit ball.
In C++, which I'm more familiar with, it takes about 800 iterations at a learning rate of 0.1 to converge to the minimum.
I'm not an expert in Python, but I translated the exact same code, and with the exact same data on Quantopian it takes a learning rate of about 999 to converge to the minimum in 800 iterations.
I know the learning rate can sometimes be > 1 depending on the situation, but given how the C++ version performed on my computer, there's absolutely no way it should need to be that large.
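For reference, here's a minimal sketch of the structure I'm describing: full-batch gradient descent with a projection back onto the L1 ball after each step. The quadratic objective, the data, and the names (`project_l1_ball`, `batch_gd`) are just placeholders for illustration, not my actual code:

```python
import numpy as np

def project_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto the L1 ball of the given radius
    (sort-based algorithm of Duchi et al., 2008)."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]                      # magnitudes, descending
    css = np.cumsum(u)
    idx = np.arange(1, len(u) + 1)
    rho = np.nonzero(u * idx > css - radius)[0][-1]   # last index satisfying the KKT condition
    theta = (css[rho] - radius) / (rho + 1.0)         # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def batch_gd(grad_f, x0, lr=0.1, iters=800):
    """Full-batch projected gradient descent over the L1 unit ball."""
    x = project_l1_ball(np.asarray(x0, dtype=float))
    for _ in range(iters):
        x = project_l1_ball(x - lr * grad_f(x))       # gradient step, then project
    return x

# Toy stand-in objective: f(x) = 0.5 * ||A x - b||^2 (my real data differs).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10)) / np.sqrt(50)       # scaled so lr = 0.1 is stable
b = rng.standard_normal(50)
grad = lambda x: A.T @ (A @ x - b)                    # gradient of the quadratic
x_min = batch_gd(grad, np.zeros(10), lr=0.1, iters=800)
print(x_min, np.abs(x_min).sum())                     # solution stays inside the L1 ball
```

For a quadratic like this, plain gradient descent is only stable when the learning rate is below 2/L, where L is the largest eigenvalue of AᵀA, so needing a rate near 1000 on the same data makes me suspect the Python version's gradient is off by a large constant factor somewhere.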