Hello Fawce,
In response to your request for "suggestions for how we should evaluate our population of algorithms and quants": it will depend on the objectives of each algorithm. Basically, each algorithm needs a kind of prospectus, against which its execution and financial performance can be measured. With your demo live algo, you've effectively done this by writing an algo to mirror an ETF (https://www.quantopian.com/industries-portfolio). So, it is a straightforward exercise to analyze your demo live algo.
To illustrate my point: you note that among "strategies that have been running for 20 days or more, about 75% are profitable," but that statistic does not reveal whether the algos are succeeding against their objectives. As an (unlikely) example, what if a significant fraction of the algos were designed to hedge against a big market correction, and so should not be profiting in the current bull market? Then the fact that 75% are profiting (contrary to their objectives) would indicate failure rather than success.
It might turn out that your customer base is taking a fraction of their investable net worth and intentionally speculating in the market. If only 50% of them profit consistently (relative to an appropriate benchmark), it might still be a win compared to less quantitative approaches to speculation (e.g. manual "day-trading"). Do you have a sense of what your live traders are trying to accomplish with their algos, in terms of risk-reward? There's also the case of investors willing to take a hit on return in exchange for reduced volatility. So, it might be a good thing if somebody puts their diversified retirement portfolio into Q/IB and underperforms SPY, for example.
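Just to make concrete what I mean by judging an algo relative to a benchmark and a volatility objective rather than on raw profit, here's a minimal sketch. The function name, the column alignment, and the idea of feeding it daily algo returns alongside SPY returns are all my own assumptions for illustration, not anything Quantopian currently provides:

```python
import numpy as np
import pandas as pd

def evaluate_vs_benchmark(algo_returns: pd.Series,
                          benchmark_returns: pd.Series,
                          periods_per_year: int = 252) -> dict:
    """Compare an algo's daily simple returns against a benchmark (e.g. SPY).

    Both inputs are assumed to be indexed by date; only overlapping dates are used.
    """
    aligned = pd.concat([algo_returns, benchmark_returns],
                        axis=1, keys=["algo", "bench"]).dropna()
    excess = aligned["algo"] - aligned["bench"]
    ann_factor = np.sqrt(periods_per_year)

    return {
        # Total return over the common period, algo vs. benchmark
        "algo_total_return": (1 + aligned["algo"]).prod() - 1,
        "bench_total_return": (1 + aligned["bench"]).prod() - 1,
        # Annualized volatility -- the relevant figure for a "lower volatility" objective
        "algo_volatility": aligned["algo"].std() * ann_factor,
        "bench_volatility": aligned["bench"].std() * ann_factor,
        # Benchmark-relative performance: mean daily excess return and a crude
        # information-ratio-style number
        "mean_daily_excess": excess.mean(),
        "excess_sharpe": ((excess.mean() / excess.std()) * ann_factor
                          if excess.std() > 0 else float("nan")),
    }
```

On numbers like these, an algo that underperforms SPY on total return but posts much lower volatility could still be counted a success, if that is what its owner set out to do.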
So, you could consider a kind of online guided prospectus template/questionnaire that would automatically generate an appropriate set of metrics for a given algo, and then apply them. Then you could roll the results up periodically to see how everyone is doing against their individual objectives.
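To show roughly what I have in mind, here's a toy sketch of how a filled-out prospectus could automatically select the metrics an algo gets scored against. The objective categories, field names, and metric names are all hypothetical placeholders I invented for illustration, not a real Quantopian schema:

```python
# Hypothetical prospectus produced by the online questionnaire.
prospectus = {
    "objective": "hedge",            # e.g. "hedge", "speculate", "low_volatility"
    "benchmark": "SPY",              # instrument the algo should be measured against
    "target_correlation": -0.5,      # a hedge algo should move against the benchmark
    "max_drawdown_tolerance": 0.15,  # investor's stated pain threshold
}

# Map each objective category to the metrics that actually matter for it.
metrics_by_objective = {
    "hedge": ["correlation_to_benchmark", "performance_in_down_markets"],
    "speculate": ["total_return", "hit_rate_vs_benchmark"],
    "low_volatility": ["annualized_volatility", "max_drawdown", "sharpe_ratio"],
}

def metrics_for(prospectus: dict) -> list[str]:
    """Return the metric names an algo should be scored on, per its prospectus."""
    return metrics_by_objective.get(prospectus["objective"], ["total_return"])

print(metrics_for(prospectus))
# ['correlation_to_benchmark', 'performance_in_down_markets']
```

The periodic roll-up would then report, per objective category, what fraction of algos are meeting their own stated targets, rather than a single profitability percentage across the whole population.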
Grant