How much does the cost of data limit what might otherwise be an effective machine learning algorithm? For example, if you took the exact same algorithm, designed to consume as much data as possible, how would its returns change going from all the free data on Quandl to all the premium data on Quandl? At what point do you look at a machine learning algorithm paper trading on free data and decide to upgrade to premium data, and how do you decide which data to buy? Ballpark estimates are appreciated; I'm trying to get a feel for how a Wall Street fund might perform using only free data.