All That Glitters Are Bugs! A Concrete Notebook Example

I was preparing this algo for the new contest, so all the settings are updated, including the new FixedBasisPointsSlippage model, QTradableStocksUS, and risk_loading_pipeline. I ran a four-year backtest and the results seem to have potential. I then ran a full round-trip tear sheet and the backtest evaluation procedure to check compliance. Below is a link to an image of the backtest, followed by the full tear sheet:
Image of Backtest


And here are the results of the compliance evaluation. It failed 5 of the 9 tests! I need to do some further digging...

Checking positions concentration limit...
FAIL: Max position concentration of 20.16% > 5.0%.

Checking leverage limits...
FAIL: Leverage range of 0.75x-1.19x is not between 0.8x-1.1x.

Checking turnover limits...
PASS: Mean turnover range of 10.39%-22.71% is between 5.0%-65.0%.

Checking net exposure limit...
FAIL: Net exposure (absolute value) of 32.82% on 2015-11-11 > 10.0%.

Checking beta-to-SPY limit...
FAIL: 99th percentile absolute beta of 0.43 > 0.3.

Checking sector exposure limits...
PASS: All sector exposures were between +/-0.20.

Checking style exposure limits...
PASS: All style exposures were between +/-0.40.

Checking investment in tradable universe...
FAIL: Investment in QTradableStocksUS of 0.00% on 2014-01-03 is < 90.0%.

Checking that algorithm has positive returns...
PASS: Cumulative returns of 1.30 is positive.

Results:
4/9 tests passed.

Score computed between 2016-01-05 and 2018-01-19.
Cumulative Score: 6.122867

And this is the block of algo code that deals with the Optimize API and its constraints:

    # Setup Optimization Objective  
    objective = opt.MaximizeAlpha(predictions_top_bottom)

    # Setup Optimization Constraints  
    beta_neutral = opt.FactorExposure(  
        context.beta.dropna(),  
        min_exposures={'beta': -MAX_BETA_EXPOSURE},  
        max_exposures={'beta': MAX_BETA_EXPOSURE},  
    )  
    constrain_gross_leverage = opt.MaxGrossExposure(1.0)  
    constrain_pos_size = opt.PositionConcentration.with_equal_bounds(  
        -0.05,  
        +0.05,  
    )  
    market_neutral = opt.DollarNeutral()

    if predictions_top_bottom.index.duplicated().any():  
        log.debug(predictions_top_bottom.head())

    sector_neutral = opt.NetGroupExposure.with_equal_bounds(  
        labels=context.risk_factors.Sector.dropna(),  
        min=-0.2,  
        max=0.2,  
    )

    # Run the optimization. This will calculate new portfolio weights and  
    # manage moving our portfolio toward the target.  
    order_optimal_portfolio(  
        objective=objective,  
        constraints=[  
            constrain_gross_leverage,  
            constrain_pos_size,  
            market_neutral,  
            sector_neutral,  
            beta_neutral,  
        ],  
    )  

    MAX_BETA_EXPOSURE = 0.3  # constant used by the beta constraint above

James, thanks for sharing this example. Looking at the results, I have a few comments and questions:

General
It looks like there might be a data issue around April 2017. Have you drilled into the positions on those days and seen if anything jumps out as a data error? If not, would you be willing to create a support ticket and grant our support team permission to look at the backtest?

Position Concentration
It looks like you only have 1-3 positions that are outside the 5% limit. If there's a data error in 2017, I'm wondering if this is related. I'd recommend looking into when your algorithm had its largest holdings and seeing if you can pinpoint the reason. Beyond a possible data error, you should consider your rebalance frequency. For example, if you rebalance once per month, you might simply be holding names that go up or down in value enough to push them above the 5% limit. You could try supplying a stricter constraint to order_optimal_portfolio (maybe 3%?) to see how it affects the result.
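A quick way to sanity-check this in research is to compute the maximum weight from end-of-day position values. This is a toy sketch with made-up dollar positions and a hypothetical helper, not Quantopian's own compliance check:

```python
import pandas as pd

def max_concentration(position_values):
    """Largest absolute position weight relative to gross portfolio value."""
    gross = position_values.abs().sum()
    return (position_values.abs() / gross).max()

# Hypothetical end-of-day dollar positions (illustrative, not from the backtest).
positions = pd.Series({"AAAA": 60_000, "BBBB": -45_000, "CCCC": 30_000})
print(round(max_concentration(positions), 4))  # 0.4444 (60k of 135k gross)
```

Running this daily over a backtest's positions would show exactly which days (and which names) breach the 5% limit.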

Leverage
On the lower end, it looks like your portfolio takes a bit of time to get to 1x leverage. Is that expected? I'm wondering if you can create a rule for your algo to get into its initial portfolio a little more quickly. On the upper end, it looks like the spike in 2017 is pushing the algo over the 1.1x limit. Again, I'm interested to know what's causing the sudden jump in returns.

Dollar Neutral
In 2017, it looks like the dollar exposure jumps with the sudden jump in returns. However, it looks like there's a bigger drift between long and short holdings at the end of 2015. One reason that this could occur would be a slower rebalance schedule. Another could be that your orders for short positions are getting canceled. I'd be interested to hear your thoughts on this.

Beta-to-SPY
It looks like this algo is pretty consistently positively correlated with the market. Most of the time, it seems to be within the limit of 0.3, but I'd recommend iterating on your alpha factor in research to see if it's inherently correlated to SPY. Of course, the issue might be tied to the long exposure, so I'd suggest looking into that first. As mentioned in this post, we're working on content to help you with beta analysis in the research step.

Tradable Universe
Unfortunately, without seeing the code, it's hard for me to dissect this one. Given that the failure is reporting 01/03/2014 as being 0% invested in the QTU, my first instinct is that the algo is slow to enter its initial portfolio, which may not be properly handled by the notebook. I'll have to investigate further. Same question as above on this: would it be possible for the algo to enter into its initial portfolio more quickly?


Let's tackle the easy ones first. This failure:

Checking investment in tradable universe...
FAIL: Investment in QTradableStocksUS of 0.00% on 2014-01-03 is < 90.0%.

My base universe is set to QTradableStocksUS. On 2014-01-03, the first trading day of the algo, I checked the transactions and found that all trades were in ETFs, something not allowed in the contest. So why did the algo execute these trades when my code instructs it to trade only the QTradableStocksUS? Let's call this Bug #1, the cockroach type!

These next two failures are somewhat related:

Checking positions concentration limit...
FAIL: Max position concentration of 20.16% > 5.0%.

Checking leverage limits...
FAIL: Leverage range of 0.75x-1.19x is not between 0.8x-1.1x.

Again, we've asked the Optimize API to constrain these two to be within the contest thresholds. Apparently, it did not accomplish its job. But deeper digging in the Transactions section reveals that somewhere between 3/15/2017 and 3/17/2017, something very strange happened with a particular stock, symbol "LLEX", an energy company. Its price was $0.13 on 3/15/2017 and shot up to $4.50 on 3/17/2017, making over $3 million that day! I thought maybe there was a reverse split that was not adjusted by the data provider or Quantopian, so I checked Yahoo historical data, and there was no reported reverse split. So I have to conclude that this is a price reporting bug by Quantopian's data provider. But since it is potentially a two-pronged bug (Optimize and price data), let's call it Bug #2, the beetle type.
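For reference, the mechanics of back-adjusting prices for a reverse split are simple. This sketch uses illustrative closes (only the 3/15 price matches the thread; the post-split price here is made up) and assumes a 1:10 reverse split taking effect on 3/17:

```python
import pandas as pd

# Illustrative raw (unadjusted) closes around a hypothetical 1:10 reverse split.
raw = pd.Series(
    [0.13, 0.13, 1.40],
    index=pd.to_datetime(["2017-03-15", "2017-03-16", "2017-03-17"]),
)
split_date = pd.Timestamp("2017-03-17")
ratio = 10  # a 1:10 reverse split multiplies the per-share price by 10

# Back-adjust: scale every pre-split price up by the split ratio so the
# series is continuous in post-split terms.
adjusted = raw.copy()
adjusted[adjusted.index < split_date] *= ratio

print(adjusted.round(2).tolist())  # [1.3, 1.3, 1.4]: no artificial 10x jump
```

If the data provider skips this adjustment, a backtest sees the raw series and interprets the split day as an enormous one-day return, which is exactly the kind of jump described above.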

The next failure is:

Checking net exposure limit...
FAIL: Net exposure (absolute value) of 32.82% on 2015-11-11 > 10.0%.

Wow, 32.82% on 2015-11-11! I checked the Transactions on that day: longs of about $970K and shorts of about $496K across 910 transactions. I'm thinking maybe a lot of partial fills, but I filtered for the top 200 by market cap. But again, why didn't the DollarNeutral constraint handle this well? So let's call this bug the fly type.
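As a sanity check on that number: one common definition of net exposure divides the long/short imbalance by gross exposure, and with the rounded figures above it lands close to the reported 32.82% (a toy calculation, not necessarily the notebook's exact formula):

```python
# Rounded long/short dollar values quoted from the Transactions tab above.
long_value = 970_000
short_value = 496_000  # absolute value of the shorts

gross = long_value + short_value
net_exposure = (long_value - short_value) / gross
print(f"{net_exposure:.2%}")  # 32.33% with these rounded inputs
```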

The last failure:

Checking beta-to-SPY limit...
FAIL: 99th percentile absolute beta of 0.43 > 0.3.

I noticed, while observing the algo run, that beta was high in the beginning and slowly tapered off to its final ex post value of 0.24. Again, why did the beta-neutral constraint not take care of this? I've already said a lot about this beta constraint, so for now let's just call this bug the mosquito type!

So there you have it. I'd rather have the community and the Q staff comment and give feedback on these findings. I'm exhausted already!

@Jamie, the strategy trades leveraged ETFs, stuff like DUST and NUGT. It also trades ETNs, which, if I recollect correctly, should not be part of the QTradableStocksUS universe.

LLEX had a 1:10 reverse split on 24/06/2016, which explains the huge jump.

@Jamie, thanks for your quick response. You beat me to the punch; I just posted my findings. Please read and comment again. I would especially like you to comment on why the Optimize API doesn't seem to be doing a good job of constraining to the contest thresholds.

@Guy, thanks for your feedback. On the first day of trading, 1/3/2014, the algo traded all ETFs, which I know is not allowed in the contest. I have set my base universe to trade only the QTradableStocksUS, so why did it do this? This one is definitely a code bug. If in fact LLEX had a reverse split on 24/06/2016, why was it only reflected on 3/17/2017? Date of announcement vs. date of effectiveness? Still, aren't we supposed to be working with adjusted prices for an apples-to-apples comparison? This one is definitely a price reporting bug.

@James: Any chance you could share your backtest + code so I can dig into the issues? If you created a support request, grant permission on the backtest, and mention that you'd like me to take a look at it, I can go through our support channel to see what I can dig up. Unfortunately, I'm fairly limited to asking questions instead of answering them without seeing the code. Most of your findings are correct, but I don't yet agree with the conclusion that the Optimize API is failing. The behavior could have several possible explanations (I mentioned some of them, like trade frequency, above). I'll need to dig into the code to figure out exactly what's going on.

@Jamie, at this point I'd rather not, because as I mentioned in my opening, this algo is potentially my contest entry, once I sort out the kinks. But the basic code is a clone of "Machine Learning Part III". The only changes I made were:

1. Changed the base universe to QTradableStocksUS
2. Changed the alpha factors to my 3 secret sauces
3. Added a beta constraint to the Optimize API (as shown in the blockquote above)
4. Set all constraints to match the contest thresholds (as shown in the blockquote above)
5. Set the number of training days to 126
6. Set the number of forward prediction days to 1
7. Almost forgot: changed the type of classifier

While you're at it, I'd also like to direct you and other Q staff to my other findings in these threads: possible-implementation-bug-with-beta and beta-constraint-in-risk-model-totally-unnecessary

@Karl, thanks but no thanks on your suggestions. Would you like me to give you a tutorial on the Optimize API and debug your overfitted code?

@Karl, again with your bad habit of deleting posts. On the PS of your just-deleted post, you've got to be kidding me about where the parameters of the Optimize API should be or are "missing". Look at the code, which is the Q model code. You're just sour-graping because I've schooled you several times. Be a man, man!

@James: Here are some of the first steps that I would recommend taking to debug:

  1. Look at the objective function being passed to order_optimal_portfolio on day 1. Does it include the ETFs? If so, there might be a bug in the pipeline, or the construction of the objective function.
  2. Figure out the cause of the jump in returns in March/April 2017. If you look at the positions on the day before and the day after the jump, you will likely be able to find a single asset causing the problem. If that's the case, let me know which it is, what the date of the price jump (or drop) is, and roughly how big the change is.
  3. If you are calling order_optimal_portfolio less frequently than daily, try to determine if your constraints are failing the day of/after the call to order_optimal_portfolio, or if the portfolio is drifting and failing the constraints later in the week/month.

Remember, the Optimize API does not guarantee that your portfolio will satisfy all of the contest constraints. Here are a few important properties to know about order_optimal_portfolio:
- The constraints supplied to order_optimal_portfolio are not applied on a continuous basis; they are applied once, when the algorithm places its orders.
- The constraints supplied to order_optimal_portfolio are based on historical data, which can vary in predictiveness for different metrics.
- Portfolio-level metrics like beta-to-SPY are controlled with per-asset constraints, which tend to have more variance in success.

There is no guarantee that constraining your portfolio on historical data means that it will adhere to the constraint in perpetuity. Take an extreme, but simple example where you have a TargetPortfolioWeights objective specifying a target of 50% in asset AAAA and 50% in BBBB. Let's say you supply a max position constraint of 50% to order_optimal_portfolio. Let's assume when order_optimal_portfolio is called, your orders fill exactly at the price at which the order was placed, and you get 50% AAAA and 50% BBBB in your portfolio. The next day, the price of AAAA doubles, and now your portfolio is 67% AAAA and 33% BBBB. In this example, we supplied a constraint of 50% max position concentration which was respected when we placed our orders. However, the next day, our portfolio exceeded this limit since we didn't re-check the status of our portfolio. This is something that needs to be handled by the algorithm.
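The AAAA/BBBB scenario can be reproduced in a few lines of toy Python (illustrative only; none of this is Quantopian API code):

```python
import pandas as pd

def weights(prices, shares):
    """Portfolio weights implied by current prices and share counts."""
    value = prices * shares
    return value / value.sum()

# Day 0: fills land exactly at the 50/50 target, respecting the 50% max.
prices = pd.Series({"AAAA": 10.0, "BBBB": 10.0})
shares = pd.Series({"AAAA": 50, "BBBB": 50})  # $500 each on $1,000 capital
assert weights(prices, shares).max() == 0.5   # constraint holds at order time

# Day 1: AAAA doubles and no new call to order_optimal_portfolio is made.
prices["AAAA"] = 20.0
print(weights(prices, shares).round(4).to_dict())
# {'AAAA': 0.6667, 'BBBB': 0.3333}: the 50% limit is breached by drift alone
```

Nothing went wrong in the optimization; the portfolio simply drifted after the orders were placed, which is why re-checking (or re-optimizing) more frequently matters.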

While the example might be a bit extreme, I'm trying to demonstrate that the risk metrics of a portfolio aren't solely controlled by the Optimize API. To account for something like price drift, I'd recommend leaving a cushion on certain constraints like position concentration (try specifying something less than 5%). If your algorithm checks its signal on a weekly basis, you might still want to use calculate_optimal_portfolio on a daily basis to make sure you don't drift too far from a portfolio that lives within the constraints of the contest.

Finally, on the other thread about beta, my take is that you've identified that supplying a constraint based on historical, per-asset beta is frequently not a great way to predict tomorrow's portfolio beta. We're working on a content piece that analyzes beta-to-SPY at the research step of an alpha factor, which might be a better way to control portfolio beta. I think there are also a series of experiments that can be done to see if there are other per-asset constraints that can be supplied to constrain portfolio beta. You mentioned the similarities between beta-to-SPY and dollar neutrality, so there may be some algos that simply don't need a separate constraint for beta. Of course, there may still be some algos that do benefit from some sort of extra beta constraint. One thing you could try is defining a slightly different per-asset beta that is partly based on a prior assumption that assets have a beta-to-SPY of 1. There's an example of this in the long-short equity lecture example. There's not necessarily one solution that applies to every algo, so I think it's worth experimenting and figuring out what works best for your case.

@Jamie, notwithstanding everything that was said in prior posts, there is a “bug” in the tear sheet with the round-trip calculations.

You have a strategy doing some 42,658 trades with an average net profit per trade of –$41.98, resulting in a loss of –$1,790,806.99. How can the charts show a positive return? Either the charts or the calculations are right, but not both!

@Guy, nice catch! Yeah, so which is right, the chart or the PnL report? Head scratch!

@Jamie, aside from Guy's recent catch, let's break down what we know so far:

  1. Price reporting bug on LLEX: the price on 3/15/17 was $0.13 and shot up to $4.50 on 3/17/2017. The 1:10 reverse split was not adjusted by your data provider. I don't know if Q internally does a data integrity check on the data provider's data. If not, I think you should.

  2. On the algo trading ETFs, I don't really know; it could well be on my end. I'll double check.

  3. On the Optimize API's ability to constrain effectively, the jury is still out. As you illustrated above in your two-stock example, the disconnect is the Optimize API's inability to see events after the fact, which sometimes results in disparities. As there are going to be black swan events now and then, perhaps the solution is relaxing the thresholds to account for these occurrences. If, for example, the leverage threshold is violated only once or twice a year, the algo could still pass the test.

  4. On the other thread, I think exposing the beta calculation to be user-configurable is a huge problem. Once you let the user change the regression length of the beta calculation, the beta threshold becomes relative to the window length; I don't think this is the intent.

@James, I did a bit of digging on LLEX. It looks like the stock split occurred in 2016 as LLEX was filing for NASDAQ relisting. Unfortunately, we do not get data for OTC stocks, so our pricing data looks like it has a bit of a gap between when the stock was delisted and when it was relisted. This is a limitation of the platform.

The good news is that LLEX wasn't in the QTU (see the bottom part of the attached notebook), so once you figure out the issue with the universe, the data issue should go away with it.

  1. The contest requirements will have some protection against outlier events, but the algo you shared in this post still won't meet some of the criteria. We'll have more on the exact outlier guarding in the next couple of weeks.

  2. To be clear, the portfolio beta risk metric is not defined by the algo author. The same definition is used for all backtests. I'm suggesting that authors can decide what is best used to control the beta of their algorithm (I'm not sure if we disagree on this one, or if there's just a misunderstanding).

@Jamie, thanks for your response. The issue of ETF and OTC trades could well be on my end, just have to find it.

On no. 2 above: going by the Q model code, the beta calculation is done in the pipeline, and the regression length can be changed by the author; i.e., your default is 260, which I could change to 5. The result of this calculation is picked up by the beta constraint in the Optimize API and checked against the set min/max beta exposure. Changing the window length of the beta calculation will also change the distribution of betas; therefore the beta threshold of +/-0.3 becomes relative to the window length, and I don't think this is the intent. I also suspect there is another beta calculation inside the Optimize API. If this is true, the question is which one takes precedence, and how does the interplay between these two calculations affect constraining beta? I hope I made my point clearer.

@Jamie -

We're working on a content piece that analyzes beta-to-SPY at the research step of an alpha factor, which might be a better way to control portfolio beta.

As you suspect, building a better beta will require constructing a beta factor that is predictive across the QTradableStocksUS. To this end, it would be interesting to understand how good SimpleBeta is at predicting beta: for which stocks it works, and for which it falls apart. For example, there is evidence that for high-volatility stocks, it is a poor predictor:

universe = (
    AnnualizedVolatility(mask=QTradableStocksUS())
    .percentile_between(85, 100)
)

It would be interesting to know the predictability of beta versus an arbitrary "slice" of the QTradableStocksUS. Additionally, my intuition at this point is that tightly controlling beta to zero (by picking an appropriate "slice" of the QTradableStocksUS) will make it very challenging to do much better than the risk-free rate, over the long term, on average. In other words, if the universe is selected solely for the purpose of obtaining beta ~ 0, it won't have enough variability for profitable trading. Trying to control beta to zero and at the same time making an uncorrelated profit, on a portfolio of 100-150 stocks, may be near impossible; a portfolio of stocks with predictable betas won't be so "tradable."

I'd also be interested in your comments on computing beta using OLS versus TLS. The potential pitfall in applying the former is regression dilution.
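To make the regression-dilution point concrete, here is a toy comparison on synthetic returns where the observed market series carries measurement noise. OLS attenuates the slope toward zero, while total least squares (computed here via the first principal component) recovers it, under the equal-error-variance assumption TLS makes. All inputs are simulated; this is not SimpleBeta's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
market = rng.normal(0, 0.01, n)                  # latent "true" market returns
stock = 1.0 * market + rng.normal(0, 0.01, n)    # true beta of 1.0
market_obs = market + rng.normal(0, 0.01, n)     # observed market, with noise

# OLS beta: attenuated toward zero by the noise in the regressor.
ols_beta = np.cov(stock, market_obs, ddof=1)[0, 1] / np.var(market_obs, ddof=1)

# TLS beta: slope of the first principal component of the joint return cloud.
X = np.column_stack([market_obs, stock])
X -= X.mean(axis=0)
_, _, vt = np.linalg.svd(X, full_matrices=False)
tls_beta = vt[0, 1] / vt[0, 0]

print(round(ols_beta, 2), round(tls_beta, 2))  # OLS near 0.5, TLS near 1.0
```

With the noise variance equal to the signal variance, the attenuation factor is 1/2, which is why OLS lands near 0.5 here.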

Finally, regarding the QTradableStocksUS constraint: as I understand it, the composition changes over time. So is the test being applied point-in-time? For example, if the algo doesn't exit stocks that are no longer in the QTradableStocksUS, will it fail the constraint?

Jamie -

You are welcome to look at this, as well (code published for the masses):

https://www.quantopian.com/posts/new-contest-entry-trial

What is the standard tear sheet to be using? Perhaps you could run it, and attach it there.

Hi Grant, went out on a hot date yesterday? Hope you enjoyed your brief respite. Anyway, you have raised some very good and interesting points above, and I would like to hear about the outcome of Q's research and tests on them.

@Jamie, I took a look at the Q documentation on Optimization and quote some snippets of it below:

order_optimal_portfolio(objective, constraints)
Calculate an optimal portfolio and place orders toward that portfolio.

Parameters:
objective (Objective) -- The objective to be minimized/maximized by the new portfolio.
constraints (list[Constraint]) -- Constraints that must be respected by the new portfolio.
universe (iterable[Asset]) -- DEPRECATED. This parameter is ignored.
Raises:
InfeasibleConstraints -- Raised when there is no possible portfolio that satisfies the received constraints.

UnboundedObjective -- Raised when the received constraints are not sufficient to put an upper (or lower) bound on the calculated portfolio weights.
Returns:
order_ids (pd.Series[Asset -> str]) -- The unique identifiers for the orders that were placed.

Debugging Optimizations
One issue users may encounter when using the Optimize API is that it's possible to accidentally ask the optimizer to solve problems for which no solution exists. There are two common ways this can happen:

There is no possible portfolio that satisfies all required constraints. When this happens, the optimizer raises an InfeasibleConstraints exception.
The constraints supplied to the optimization fail to enforce an upper bound on the objective function being maximized. When this happens, the optimizer raises an UnboundedObjective exception.
Debugging an UnboundedObjective error is usually straightforward. UnboundedObjective is most commonly encountered when using the MaximizeAlpha objective without any other constraints. Since MaximizeAlpha tries to put as much capital as possible to the assets with the largest alpha values, additional constraints are necessary to prevent the optimizer from trying to allocate "infinite" capital.

Debugging an InfeasibleConstraints error can be more challenging. If the optimizer raises InfeasibleConstraints, it means that every possible set of portfolio weights violates at least one of our constraints. Since different portfolios may violate different constraints, there may not be a single constraint we can point to as the culprit.

One symptom that a code implementation is broken is when the code does not attain its objective, or when some operation is not triggering when it's supposed to.

In the above notebook example, on day one of trading, the algo, right off the bat, violated the QTradableStocksUS constraint. Shouldn't the Optimize API, via the order_optimal_portfolio operation, have halted and raised the InfeasibleConstraints error, saving the user's and Q's computational resources? The documentation says it should, but it doesn't. This is part of my reasoning for suspecting that, from a code implementation perspective, the Optimize API may not be doing its job properly and efficiently.

@James: Unfortunately, there's no way to supply QTradableStocksUS as a constraint to order_optimal_portfolio. Instead, your portfolio will be selected from the universe of stocks defined in your objective function. I suspect that if you look at predictions_top_bottom in your code, you might find names that aren't in the QTU. Looking at that variable is the best place to start debugging this behavior.
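A minimal sketch of the kind of check Jamie suggests, with made-up symbols and a stand-in set for the QTU (on Quantopian, predictions_top_bottom is indexed by Equity objects and the real universe comes from a pipeline filter):

```python
import pandas as pd

# Made-up alpha scores keyed by ticker; illustrative only.
predictions_top_bottom = pd.Series(
    {"AAPL": 0.8, "MSFT": 0.5, "DUST": 0.9, "NUGT": -0.7}
)
qtu = {"AAPL", "MSFT"}  # stand-in for the point-in-time QTradableStocksUS

in_qtu = predictions_top_bottom.index.isin(qtu)
print(sorted(predictions_top_bottom.index[~in_qtu]))  # ['DUST', 'NUGT']

# Dropping the offenders before building the objective keeps them out of
# the optimizer's reachable universe entirely.
predictions_top_bottom = predictions_top_bottom[in_qtu]
```

If the leveraged ETFs show up in that first print, the bug is upstream of the Optimize API, in how the predictions Series was built.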

On your point above about beta, there is no other beta constraint computed by Optimize. Optimize only knows about historical/current pricing and volume data plus the objective and constraints you supply. Even the beta_neutral constraint is defined as a general FactorExposure constraint.

by the Q model code, beta calculation is done in the pipeline and the regression length can be changed by the author, i.e. your default is 260, which I could change to 5. The result of this calculation is picked up by the beta constraint in the Optimize API and checked against the set min/max beta exposure.

You're right that you can modify the beta definition that you use in the constraint. The way you should think about this is that we are asking the community to write algorithms that meet specific criteria. You can achieve these criteria however you'd like. To help you achieve these, we build and publish tools for you to use. Some of the tools we published are strictly required, because we don't believe it's reliable to meet the constraint without using it (e.g. QTU). For others, we suggest possible use cases, but ultimately leave it up to you how you want to achieve the criteria. I think the position concentration example is a good example of this. There's a PositionConcentration constraint that you can use to control your position concentration, but you can choose the bounds, the frequency at which you check the constraint (how frequently you call order_optimal_portfolio), and more. This is no different for beta-to-SPY.

@Jamie,

Unfortunately, there's no way to supply QTradableStocksUS as a constraint to order_optimal_portfolio. Instead, your portfolio will be selected from the universe of stocks defined in your objective function.

Ok, I can understand that there is no way to supply the QTU as a constraint. But from a code implementation and efficiency perspective, you could do this:

    # Run the optimization. This will calculate new portfolio weights and  
    # manage moving our portfolio toward the target.  
    algo.order_optimal_portfolio(  
        objective=objective,  
        constraints=[  
            constrain_gross_leverage,  
            constrain_pos_size,  
            market_neutral,  
            sector_neutral,  
            beta_neutral,  
        ],  
    )  
# Pseudocode, run after each call to order_optimal_portfolio, to cover
# contest criteria that the Optimize API does not handle:
if qtu_investment < 0.90:
    raise Error("did not meet QTU criterion")
elif not (0.05 <= trailing_90d_mean_turnover <= 0.65):
    raise Error("did not meet Daily Turnover criterion")

Isn't this more efficient, code flow wise?

@Jamie,

On your point above about beta, there is no other beta constraint computed by Optimize. Optimize only knows about historical/current pricing and volume data plus the objective and constraints you supply. Even the beta_neutral constraint is defined as a general FactorExposure constraint.

Are you sure about this? How come, when I comment out the beta calculation in the pipeline and the beta constraint in the Optimize API, beta is still reflected in the performance graph and report? It must be calculating it somewhere, don't you think?

You're right that you can modify the beta definition that you use in the constraint. The way you should think about this is that we are asking the community to write algorithms that meet specific criteria. You can achieve these criteria however you'd like. To help you achieve these, we build and publish tools for you to use. Some of the tools we published are strictly required, because we don't believe it's reliable to meet the constraint without using it (e.g. QTU). For others, we suggest possible use cases, but ultimately leave it up to you how you want to achieve the criteria. I think the position concentration example is a good example of this. There's a PositionConcentration constraint that you can use to control your position concentration, but you can choose the bounds, the frequency at which you check the constraint (how frequently you call order_optimal_portfolio), and more. This is no different for beta-to-SPY.

Let me put it the way the hedge fund industry interprets a beta-neutral strategy. When a manager says he deploys a beta-neutral strategy, the measure of beta is based on the industry standard, which is one year of daily returns. No manager will say, "I deploy a beta-neutral strategy based on five days of daily returns." Tweaking beta as a measure of beta neutrality is not an option, because it is standardized. This is what I mean when I say that if you change the regression length of beta, the measurement becomes relative to the window length.
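The point that the threshold becomes relative to the window can be seen on synthetic data: with a true beta of 0.3, short-window beta estimates scatter far more widely than long-window ones, so the same +/-0.3 threshold means very different things. This is a toy sketch, not Quantopian's calculation:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2600  # roughly ten years of daily returns
spy = rng.normal(0.0005, 0.01, n)
stock = 0.3 * spy + rng.normal(0, 0.02, n)  # true beta-to-SPY of 0.3

def window_betas(window):
    """Non-overlapping OLS beta estimates over fixed-length windows."""
    betas = []
    for start in range(0, n - window + 1, window):
        s = slice(start, start + window)
        cov = np.cov(stock[s], spy[s], ddof=1)
        betas.append(cov[0, 1] / cov[1, 1])
    return np.array(betas)

for w in (5, 260):
    print(w, round(window_betas(w).std(), 2))  # 5-day betas scatter far wider
```

A 5-day beta estimate can easily exceed +/-0.3 in absolute value through pure estimation noise, even when the true beta sits exactly at 0.3.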

@James, I think you are forgetting one thing.

Some might like to use the Optimize API outside the contest arena where it would be their responsibility to choose whatever objective functions they might like with whatever type of restrictions they want.

For instance, I do not like the minimum turnover constraint. I think no minimum should be used. If you can generate some alpha with as low a turnover as possible, all the better for you. Less churning, less commissions, less fees. But, then, I am not forced to participate in any contests.

If you look at the 5% concentration criterion, it is easily satisfied: a portfolio of 20+ stocks would do the trick. However, flying so close to the limit will have you exceed it on an outlier. Therefore, you should pad your selection with some leeway: go for 50+ stocks. I do not like the 5% concentration either, but then again, no one is forcing me to participate in a contest.

I find the Optimize API a most interesting tool to have. You set your objective functions, give it your constraints, and voilà, it outputs the best portfolio weights to meet those objectives.

The real problem is not the constraints. Those may be anything you want. You are just telling your payoff matrix to behave within such-and-such limits. If there is a solution, the Optimize API will give it, to the best of its abilities.

The real problem is in the objective functions. These are functions “you” supply that are intended to be surrogates for profit generation. You want to apply predictive price functions to a blob of price variance, while most on Quantopian ignore that this swarm is highly unpredictable from day to day. Most often, those predictive functions trade on market noise.

If these functions had predictive value, it would show, and in a big way. They would be able to produce more than the risk-free rate, and maybe, most importantly, exceed market averages: E[R(m)], generate some real alpha.

@Guy, you are absolutely correct that optimization (not the Optimize API per se) is a very useful tool in trading-systems development. I use it a lot in my work, specifically Covariance Matrix Adaptation Evolution Strategy (CMA-ES) optimization and Particle Swarm Optimization, while the Optimize API uses what you call convex optimization; there are many other flavors of optimization one can choose from. However, in the context of this thread, I was honing in on the contest criteria and the effectiveness, efficiency, and code implementation of the Optimize API.

Guy, your catch on the "bug" in PnL reporting is such a great find, and it is quite disappointing that nobody from Q seems to be addressing it, or at the very least acknowledging it and saying that they are, or will be, working on rectifying it. Cheers!

@James, yes, concerning the “bug”. They should address the problem head on. One thing we do need in all these simulations is to have the ability to trust the data and the outcome.

I would add to my previous post that even if I do not like the 5% concentration restriction, in a contest, I could easily live with it. In my kind of scenario, it would not matter much over a short-term contest interval. It is over the long run that I would want to be more permissive.

@James: The beta that is used to create the constraints for order_optimal_portfolio is defined per asset. The portfolio beta that you see in a pyfolio tearsheet is not computed from that input. It is computed by regressing the portfolio returns (simulated using Zipline) against the returns of SPY. That code lives in pyfolio and empyrical, so the definition of beta used to compute the beta of a backtest is independent of whatever you supply to order_optimal_portfolio.
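
To make this concrete, here is a minimal sketch of the portfolio-beta definition Jamie describes: the OLS slope of portfolio returns on benchmark returns, which is what the reported metric computes from realized returns. The function name and sample numbers below are illustrative, not Quantopian or empyrical code:

```python
import numpy as np

def portfolio_beta(portfolio_returns, benchmark_returns):
    """OLS slope of portfolio returns on benchmark returns:
    beta = Cov(r_p, r_b) / Var(r_b).
    This mirrors the regression-based definition of the reported
    metric; the name is illustrative, not library code."""
    r_p = np.asarray(portfolio_returns, dtype=float)
    r_b = np.asarray(benchmark_returns, dtype=float)
    cov = np.cov(r_p, r_b, ddof=1)  # 2x2 sample covariance matrix
    return cov[0, 1] / cov[1, 1]

# A portfolio that moves exactly half as much as the benchmark every
# day has beta 0.5 by construction, whatever the author's pipeline says.
bench = np.array([0.010, -0.020, 0.015, 0.003, -0.007])
port = 0.5 * bench
print(portfolio_beta(port, bench))  # → 0.5
```

Whatever per-asset betas an author feeds to order_optimal_portfolio, this after-the-fact regression on realized returns is the only input to the reported number.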

@Guy: I apologize, I forgot to respond to your bug report. Indeed the round-trip numbers look wrong. After doing some digging, it appears as though spinoffs aren't properly accounted for in the round-trip analysis, but I'm not sure if that's the only issue. I filed an issue in pyfolio repo here. Thanks for reporting it.

@Guy,

One thing we do need in all these simulations is to have the ability to trust the data and the outcome.

Spot on, Guy, this is exactly what I'm trying to do when I decompose different parts of the whole algorithmic flow of the code: to make sure everyone is playing on a level field, particularly in reference to the contest, and that it cannot be easily "gamed" and/or that program logic, data, and outcomes can be trusted.

@James, we should not constrain the program logic. It is part of the whole purpose of doing all these simulations. Design whatever trading strategy you want based on whatever decision process you want, even as weird as you want them.

However, whatever you may use, show that it has some value: generate some positive alpha. And the only way to do that is to show what your simulation could do under “normal” conditions. By normal is meant using a reasonable stock-selection universe, on whatever criteria, where all the trades were indeed feasible over the simulation interval. Going forward is another matter altogether.

If we all use the Optimize API and the “QTU”, then that is a level playing field. That is what the contest's constraints are designed to do.

As to all the objective functions that we may design, they are all admissible. Even some “voodoo” stuff if we can program some. And, if your “voodoo” stuff is productive, and you win the contest, then all I would say is: bravo, you won, fair and square.

@Jamie, I'm talking about the beta number you see in the performance graph after a backtest, under Results Overview. Is this calculation done by pyfolio and empyrical, totally independent of the beta calculation in the author's code, with no interaction or interplay at all? If so, then it is done the proper way. However, you don't seem to grasp why the beta calculation should not be user configurable: as a measure of beta neutrality, industry practice and standards fix it at one year of daily returns, so that managers and investors alike are comparing apples to apples.

This is not to say that you cannot use a beta calculation that is user configurable, if you use it as a direct alpha factor or in relation to an alpha factor. But not as a measure of beta neutrality, or as a variable in the beta constraint. And this is why I call it an implementation bug.

@Guy, when I said program logic, what I meant was code-level programming logic. I think what you meant to say, and correct me if I'm wrong, is that we should not constrain trading logic. This I totally agree with; I've seen traders make money on astrology or following moon cycles!

@James: Yes, the backtest beta is generated in Zipline, which uses empyrical under the hood. I am not disagreeing with the fact that the beta metric should not be user configurable. I am just saying that it's not user configurable. The definition of portfolio beta is fixed in empyrical and is independent of any algorithm code, just like all of the other backtest metric definitions like returns, Sharpe, max drawdown etc.

@Jamie, unless this is a typo:

I am just saying that it's not user configurable.

But it is user configurable in the pipeline. As I said earlier, I changed the SimpleBeta regression length from 260 to 5, and its result is the one evaluated by the beta constraint in the Optimize API, which is also user configurable. What am I missing?

@James: The beta metric reported in pyfolio or at the end of the backtest is not user configurable.

Creating a factor in pipeline, regardless of the name or definition, does not change the definition of the backtest or pyfolio risk metrics, it just gets you data that you can use in your algorithm to make trading decisions. SimpleBeta is just another Pipeline factor. Using it in your algorithm does not change any backtest metric definitions.

You can then use that definition of SimpleBeta as a constraint in Optimize, which may influence the portfolio that your algorithm trades. But this does not change the definition of the backtest or pyfolio risk metrics.

No matter what factors you use in Pipeline or Optimize, your backtest beta-to-SPY metric will be based on this definition from empyrical, which only takes portfolio returns and benchmark returns as input.
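
The distinction can be sketched numerically (all numbers hypothetical): a FactorExposure-style constraint only ever sees the dot product of the target weights with the loadings you supply; the backtest metric is regressed from realized returns afterward.

```python
import numpy as np

# Hypothetical per-asset betas, e.g. the output of a SimpleBeta-like
# pipeline factor on four assets (numbers made up for illustration).
loadings = np.array([1.2, 0.8, 1.0, 0.6])

# Candidate long/short target weights produced by the optimizer.
weights = np.array([0.25, -0.25, 0.25, -0.25])

# A FactorExposure-style constraint only ever checks this dot product:
exposure = float(weights @ loadings)
print(exposure)  # → 0.2 (= 0.25*1.2 - 0.25*0.8 + 0.25*1.0 - 0.25*0.6)

# The check, in plain terms, against contest-style bounds:
MIN_BETA, MAX_BETA = -0.3, 0.3
assert MIN_BETA <= exposure <= MAX_BETA  # satisfied by the *inputs*
```

The realized backtest beta is then regressed from the daily returns the resulting portfolio actually produces, so if the loadings are stale or noisy, the two numbers can diverge.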

@Jamie, this is where you are wrong: the output of SimpleBeta in the pipeline is picked up by the Optimize API via the beta constraint, where it is evaluated against the thresholds you set (minimum/maximum beta exposures), and this in turn reports to the evaluation algorithm to see if it complied with the contest threshold of ±0.3. The beta calculation in Zipline just gives you the final beta of the algo by regressing the portfolio returns against the returns of SPY, and plays no role in determining whether beta complied with the contest thresholds.

@James: The evaluation notebook uses the backtest result object (first line in the first cell), which uses the metrics from zipline/empyrical.

Optimize does not report to the research notebook. Optimize converts an objective and a series of constraints into a set of orders for the algorithm.

@Jamie,

Optimize converts an objective and a series constraints into a set of orders for the algorithm.

And this set of orders has the objective not only to maximize alpha but also to stay within the bounds of the contest thresholds, which is the reason for the constraints in the first place. Were you involved in coding the algorithm?

Hi James -

The Optimize API is more-or-less described by Scott in his post at https://www.quantopian.com/posts/request-for-feedback-portfolio-optimization-api. It has been in the works for a while and, as I understand it, is based on solving what's called a convex optimization problem. I think it uses CVXPY, which is also available to Quantopian users for their algos directly; the Optimize API is a kind of API wrapper.

Hi Grant,

Thanks for the reference. I am very familiar with optimization techniques, as I use them a lot in my work. Optimization is basically a search algorithm based on different assumptions about the shape of the solution space. Convex optimization assumes the solution space is convex when minimizing the objective (or concave when maximizing). This is very efficient in linear models, and the global and local optimal solutions are one and the same. I happen to belong to the school of thought that the financial markets are nonlinear and nonstationary, which is why I employ nonlinear optimization techniques like genetic/evolutionary algorithms, e.g. Particle Swarm Optimization. In nonlinear optimization, the global optimal solution is not necessarily a local optimal solution, because multiple solutions are possible. But all optimizations are structured the same way: you have an objective (there can also be multiple objectives) subject to constraints.
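
For readers unfamiliar with the technique James mentions, here is a minimal, self-contained Particle Swarm Optimization sketch (textbook coefficients, toy objective; not production code):

```python
import numpy as np

def pso(objective, bounds, n_particles=30, n_iters=200, seed=0):
    """Minimal 1-D Particle Swarm Optimization. Each particle keeps
    its personal best; the swarm keeps a global best; the velocity
    update blends inertia, a pull toward the personal best, and a
    pull toward the global best. Coefficients are common textbook
    defaults, not tuned values."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, n_particles)   # positions
    v = np.zeros(n_particles)              # velocities
    pbest, pbest_f = x.copy(), objective(x)
    g = pbest[np.argmin(pbest_f)]          # global best position
    inertia, c_cog, c_soc = 0.7, 1.5, 1.5
    for _ in range(n_iters):
        r1 = rng.random(n_particles)
        r2 = rng.random(n_particles)
        v = inertia * v + c_cog * r1 * (pbest - x) + c_soc * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        f = objective(x)
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        g = pbest[np.argmin(pbest_f)]
    return g

# Toy convex objective just to check convergence; minimum at x = 3.
best = pso(lambda x: (x - 3.0) ** 2, bounds=(-10.0, 10.0))
print(best)  # ≈ 3.0
```

Unlike a convex solver, PSO makes no shape assumptions about the objective, which is why it is a common choice when the solution space is believed to be multimodal.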

James -

Sorry, I didn't realize you were so familiar with optimization (more so than I am, no doubt).

One thing I'm realizing about this beast is that generally, if the Optimize API is doing a lot of heavy lifting, it is probably a bad thing. In fact, it seems that if the algo output (the combined "alpha factor" i.e. portfolio weights, per the workflow) doesn't basically conform to the constraints, there is a problem. I get the sense that the ideal algo would not need the Optimize API at all, with its multitude of diversification/risk-managing constraints.

@ Jamie - please see attached for yet another example. It is an algo that, to obtain a beta ~ 0 over a 2-year time frame, requires that I set the beta constraint in the Optimize API to:

MIN_BETA_EXPOSURE = -0.3  
MAX_BETA_EXPOSURE = -0.3  

If I don't set this bias, I get a beta of ~+0.2 for a 2-year backtest (see attached). I'm offering this up in the hope that someone can shed some light on this mystery. It seems counter-intuitive that things could be so out of whack; it does appear to be an existence proof that the Optimize API, as implemented, is not effective at controlling beta on a forward basis (I agree with James that this is a "bug" in some respects, in that the Optimize API isn't even close to doing what one would expect it to do: barf when it realizes that the beta "constraint" isn't being met).

@Grant, there might be nothing wrong with the Optimize API when you set the min and max beta to -/- 0.30. It is just that the combined impact of the other constraints takes precedence. Something like the dollar-neutral constraint will force an equilibrium on exposure and, by itself, limit the boundaries of the beta excursions. In fact, you will be forcing the portfolio beta to approach zero from either side.

Try the same code while only changing the beta min and max to -/- 0.90. You will get the same results you did. Yet, the API again complied with your request: it did not exceed the -/- 0.90, just as it did not exceed the -/- 0.30.

You are providing more than one constraint, and it is the combined impact of all of them that you see, not just one.
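
Guy's point about interacting constraints can be illustrated with hypothetical numbers: once a portfolio is forced to be dollar neutral and per-asset betas cluster near 1.0, the achievable beta exposure is mechanically small, so a wide beta band is rarely the binding constraint.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-asset betas clustered around 1.0, as is typical
# for a broad equity universe (numbers are made up).
betas = rng.normal(1.0, 0.15, 200)

# A dollar-neutral book: long 100 names, short 100 names, equal size,
# so the weights sum to zero by construction.
weights = np.r_[np.full(100, 0.005), np.full(100, -0.005)]
assert abs(weights.sum()) < 1e-12  # dollar neutral

# The beta exposure collapses to a weighted *spread* of betas, so it
# is bounded by the dispersion of betas, not by their level near 1.0.
exposure = float(weights @ betas)
print(exposure)  # small: a +/-0.9 beta band would never be binding here
```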

@Grant, just in case you were wondering, setting the min and max beta to +/+ 0.90 in your scenario will also give the same results.

Hi Guy,

If I set:

MIN_BETA_EXPOSURE = 0.9  
MAX_BETA_EXPOSURE = 0.9  

and uncomment the constraint:

    beta_neutral = opt.FactorExposure(  
        loadings=pipeline_data[['beta']],  
        min_exposures={'beta':MIN_BETA_EXPOSURE},  
        max_exposures={'beta':MAX_BETA_EXPOSURE}  
        )  
    constraints.append(beta_neutral)  

I get:

    InfeasibleConstraints: The attempted optimization failed because no
    portfolio could be found that satisfied all required constraints.
    The following special portfolios were spot checked and found to be
    in violation of at least one constraint:

    Target Portfolio (as provided to TargetWeights):
        Would violate FactorExposure(['beta']) because: Exposure to
        'beta' (0.000902370515674) would be less than min exposure (0.9).

    Current Portfolio (at the time of the optimization):
        Would violate FactorExposure(['beta']) because: Exposure to
        'beta' (0.0) would be less than min exposure (0.9).

    Empty Portfolio (no positions):
        Would violate FactorExposure(['beta']) because: Exposure to
        'beta' (0.0) would be less than min exposure (0.9).

So, it is doing something.

I suspect what's going on is that the trailing beta indicator is not a forward beta indicator, for some universes of stocks. A more predictive beta indicator is required. It does make me wonder how beta would be controlled in live trading. Using a trailing 1-year window and regressing against SPY doesn't seem like it'll cut it. I'm thinking there should be an easy way to adapt Alphalens to see what is going on, since to write a humdinger Quantopian algo, I think one needs to control beta better without a hack.

@Grant, yes, agree.

@Grant,

Did you intentionally set the min/max beta exposures to be equal at 0.9? This is called an equality constraint. I'm glad you did, because this is the first time I've seen InfeasibleConstraints being triggered. In one of my posts above, where I quoted the Q Optimize API documentation, I asked Jamie why this InfeasibleConstraints error is not triggered when beta violates the constraint settings. I never really got an answer. The error could not be clearer, as it states: InfeasibleConstraints: The attempted optimization failed because no portfolio could be found that satisfied all required constraints.

So my question now is: why was it triggered when this equality constraint was violated, and not when the inequality constraint (i.e. ±0.3) is violated?

So my question now is: why was it triggered when this equality constraint was violated, and not when the inequality constraint (i.e. ±0.3) is violated?

@ James -

The likely explanation here is that the Optimize API is working as designed. It takes in data, applies its optimization routine, and returns a result. The disconnect is that for beta, the data provided is stale. If you have a look at the help page, the constraint is simply:

(new_weights * loadings[column]).sum() >= min_exposure[column]
(new_weights * loadings[column]).sum() <= max_exposure[column]

The problem is that the risk-control problem is being solved in the past, not the future. In the case of beta, we are using a beta computed from a 1-year trailing window, so it ends up being the beta from ~6 months ago. So, if the constraints can be satisfied using stale betas, then no exception will result. However, that doesn't mean the constraint is met point-in-time (i.e. it might fail with the current stock betas). So, when the algo returns are regressed against SPY, it appears that the risk control is not working; I suspect it is working, but it is trailing risk control (unless a beta bias is imposed, as I did, but I'm probably applying look-ahead bias...shameful, but I'm hoping it persists for my Contest 38 submission, where beta ~0 should be a score booster).
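
The staleness argument can be demonstrated with a toy simulation (synthetic, noise-free returns; the window and numbers are illustrative): if a stock's true beta jumps mid-sample, a 252-day trailing regression keeps reporting roughly the window average long after the jump.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 504  # two years of daily returns
bench = rng.normal(0.0, 0.01, n)

# True beta jumps from 0.5 to 1.5 halfway through (noise-free so the
# lag of the estimator is the only effect on display).
true_beta = np.where(np.arange(n) < 252, 0.5, 1.5)
stock = true_beta * bench

def trailing_beta(stock, bench, window=252):
    """Trailing regression beta over the last `window` days."""
    r_s, r_b = stock[-window:], bench[-window:]
    return np.cov(r_s, r_b, ddof=1)[0, 1] / np.var(r_b, ddof=1)

print(trailing_beta(stock[:253], bench[:253]))  # just after the jump: still ~0.5
print(trailing_beta(stock[:378], bench[:378]))  # six months later: ~1.0, the window average
print(trailing_beta(stock, bench))              # a full year later: finally ~1.5
```

A constraint fed the trailing estimate is therefore satisfied "in the past" even while the point-in-time beta is well outside the band.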

If you Google "beta forecasting" you'll see that this is nothing new under the sun. For example, see the conclusions in:

https://forecasters.org/wp-content/uploads/gravity_forms/7-2a51b93047891f1ec3608bdbd77ca58d/2014/07/Reeves_Jonathan_ISF2014.pdf

This paper demonstrates that when reliable higher frequency returns are available, these will deliver more accurate one-month-ahead beta forecasts, relative to forecasts from returns measured at a lower frequency.

It is easy enough to try: write a version of SimpleBeta that takes in minutely returns, and then use those betas in the Optimize API.

@Grant,

Let's take the case of your baseline algo: the code follows all Q specs and was able to achieve zero beta, but failed on position concentration. Position concentration is a point-in-time variable and is not influenced after the fact the way beta and leverage can be. Why did the Optimize API not flag, point in time, InfeasibleConstraints: The attempted optimization failed because no portfolio could be found that satisfied all required constraints?

I guess I wouldn't read too much into which constraints fail and which pass. The optimization takes all constraints into account as a whole, and either finds a solution or doesn't.

@Grant,

OK if that's the way you want to interpret it.

I did something to your algo that made it pass all constraints except the style exposure:
Checking style exposure limits...
FAIL: Mean short_term_reversal exposure of 0.912 on 2017-03-06 is not between +/-0.40.

This I expected, because your style is purely mean reversion. I don't want to publish it here because it would show how one can exploit the vulnerabilities of the Q code implementation and thus "game" it. But since this is your code and you seem to be the only one listening to what I'm saying, I can share it with you via private email. Do you want me to share it, and you decide what you want to do with it?

Yes, I'm aware of the short_term_reversal style risk. If you found a gaping hole that would allow gaming, I suggest publishing it and letting Q know.

@Grant,

In your baseline algo, where you successfully achieved zero beta, you were able to pass style exposure but not position concentration, am I right?

I don't want to publish it, let them ask me then I will.

Here's the notebook of the "hack":

Yep. The whole point of the risk model style constraint is to kill this type of algo that relies heavily on short_term_reversal.