New Tool For Quants: The Quantopian Risk Model

Pardon my naive question, but does that mean that an allocatable algorithm should have as little exposure as possible to sector and style risk of any sort, and specific returns that are as large as possible?

If so, would an algorithm based (solely) on quantitative value investing principles be automatically disqualified, for example?

Tim: That's a great and very relevant question (and not naive at all). In a perfect world, algos have zero exposures at all times and are pure alpha. In reality, risk models are just models, and as such they suffer from estimation noise and from the assumptions (biases) built into them. Quantopian's risk model is no different. Driving your algorithm's exposures as low as possible might therefore dilute your alpha to the point where it no longer exists. Risk should be recognized and kept within reasonable bounds.

It also matters which time-period we're talking about. An algorithm that dips into a long energy exposure on one day and then goes short energy for some short period is fine (within a budget). It might even be timing that factor. However, an algorithm that has a consistent long energy exposure is more problematic. As with market beta, investors do not pay you to follow the market. As such, they also don't pay you to recreate an energy ETF.

So yes, algos that purely recreate existing factors will likely not receive an allocation and returns should be mainly driven by specific returns. But that doesn't mean that common risk exposures need to be zero at all times.

Disclaimer

Burrito Dan

For transparency, could you please share the CustomFactor you are using for Short Term Reversal and the other common factors beyond the ones given in Lecture 34?

Cheng Peng

@Burrito Dan
Seconded. This would be helpful when trying to constrain and reduce common factor risk.

Leo M

@Thomas, is it possible to post one or more example backtests (with risk model figures) that are NOT okay for allocation?

With so many lines in the graph, it is difficult to understand what we need to keep within bounds. I understand at a high level that the graphs show exposure to various risk factors, and that you don't want algorithms that take extraordinary and consistent risk in one factor at all times, but an example would help a lot in capturing the requirements.

Together with the recent enhancements to fundamentals, I truly believe this feature for managing risk in algorithms has elevated the Quantopian platform to new heights. Great job!

@Leo: I would encourage you to run this experiment: Create an algorithm that goes 100% long XLE the whole time, analyze it with the perf attribution and post the resulting NB.

@Burrito Dan: We plan to publish the details of the risk model successively, together with the factors. I'm not sure the exact specification is helpful as it allows to work around the specific definition of the factor, without reducing the actual exposure. Happy to hear how I'm wrong and why it would be useful.

Josh Payne

@Dan
Also, this isn't a direct answer to your question but you will likely find Scott's post from this afternoon helpful in understanding and exploring the risk model: https://www.quantopian.com/posts/introduction-to-the-quantopian-risk-model-in-research

As Thomas mentioned, we're working on more educational materials on this topic in the coming days and weeks.

Many thanks
Josh

das q

@Thomas
Knowing the risk factor specification is important for understanding how your alpha factors are aligned with the risk model. If you define a factor identically to me and I want to take a bet on it, I'm happy to take that bet, and I should probably drop it from the risk model specification. If you define it only slightly differently, say you standardize differently, and I don't drop your factor, I'm only going to be betting on the differences, which may in fact be noise (or, even worse, magnified data errors). If I understand how you define your factor and I believe mine is better, I'm happy to neutralize to your factor so as not to take the "common factor" bet. Without knowing what went into the computation, though, I can't determine how different our factors are.

All the standard providers on the market include specifications of how these factors are computed (e.g. value = 50% p/b + 50% p/e, size = log(mktcap)).

Do you guys plan on making the risk model data available for download? That would be incredibly useful to explore.

Dan Dunn

@das - thanks for that answer. Others here at Quantopian might have follow ups for you.

On the risk model data - go to Scott's post, clone the notebook, and look at the section "Working With the Risk Model in Pipeline." Please let us know if that is the access you're looking for, or if there is more that you want.

Burrito Dan

@Thomas I think the specification is useful to avoid constructing something similar by accident and then flailing around trying to work out why. It also helps with learning and with open debate about factors, as with AQR and Kenneth French.

Leo M

Nov 9, 2017

Hi @Thomas, I have attached the notebook that you asked for. I added 0.5 XLE and 0.5 AAPL to the algorithm and ran a backtest over the last two years. Could you please let me know how to read the graphs? Is it possible to create a summary table of noticeable (beyond-threshold) risk factor violations of the algorithm? In my opinion this should be automated, or highlighted with a DARK RED boldface graph line. Help in reading the graphs would also be much appreciated. Are the scales on the Y axis (1, 2, etc.) logarithmic? Thanks.

Julien D

Nov 9, 2017

Well done. Are you planning on releasing the code outside of the Quantopian platform for offline use, à la Zipline?

Dan Dunn

Nov 9, 2017

We haven't decided yet on our open source plan for this project. We definitely will be sharing a lot more information about the risk model and how it is constructed - particularly where it makes the model more useful to the community.

One option we are considering is to open source the code itself. That will give some insight about the model. The struggle at that point will be the data. Without the underlying data, it's harder to make use of the model.

Cheng Peng

For motivation, I've attached a constrained strategy. Did take me a while though.

Leo M

Can we get detailed documentation on the various graphs?

As I have mentioned before, I would like as much automation as possible in identifying and presenting red-flag violations of thresholds. A person should not be required to sort through fifteen lines in a graph; code should automatically highlight severe violations of factor/sector risk exposures that happened in any year. Thanks.

@ Thomas -

Regarding your comment:

I'm not sure the exact specification is helpful as it allows to work around the specific definition of the factor, without reducing the actual exposure. Happy to hear how I'm wrong and why it would be useful.

You are basically saying that your risk model may not work so well--if it is "game-able" then it also implies that it may not capture risks that are actually there (e.g. an algo that actually uses short-term reversal may not be flagged as such by the tool). Maybe if you are more transparent, rather than being concerned about users gaming the system, users would help you improve the tool. Or am I misinterpreting your comment?

@Grant: No, that was my comment. Upon thinking about it further though and taking your comments as well as Dan's into account I'm convinced otherwise.

Regarding short-term reversal as a risk, how does one reconcile that with Rob's statement on https://www.quantopian.com/posts/enhancing-short-term-mean-reversion-strategies-1 :

For the majority of quant equity hedge funds that have holding periods on the order of a few days to a couple weeks (“medium frequency” funds), by far the most common strategy is some variation of short-term mean reversion. Of course, while no hard data exists to support this claim, in my experience working alongside several dozen quant groups within two multi-strategy hedge funds, and admittedly only seeing aggregate performance or individual performance that was anonymized, I was able to observe a strong correlation between hedge fund performance and the returns to a simple mean-reversion strategy (for example, buying the quintile of stocks in the S&P500 with the lowest five-day returns and shorting the quintile of stocks with the highest five-day returns). When the simple mean-reversion strategy was going through a rough period, the quant groups were almost universally down as well.
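
For anyone who wants to see the simple strategy in that quote spelled out, here is a minimal sketch of the portfolio construction it describes: long the quintile with the lowest five-day returns, short the quintile with the highest. The function name, the equal weighting within each leg, and the ten made-up tickers are my own assumptions for illustration; the quote does not specify any of them.

```python
def reversal_weights(five_day_returns):
    """Dollar-neutral weights: long the lowest-return quintile, short the highest.

    five_day_returns: dict mapping ticker -> trailing five-day return.
    Equal weighting inside each leg is an assumption made for this sketch.
    """
    ranked = sorted(five_day_returns, key=five_day_returns.get)  # losers first
    n = len(ranked) // 5  # number of names in one quintile
    if n == 0:
        raise ValueError("need at least five assets to form quintiles")
    weights = {ticker: 0.0 for ticker in ranked}
    for ticker in ranked[:n]:
        weights[ticker] = 0.5 / n   # long the recent losers
    for ticker in ranked[-n:]:
        weights[ticker] = -0.5 / n  # short the recent winners
    return weights


# Hypothetical five-day returns for ten made-up tickers.
example_returns = {
    "A": -0.08, "B": -0.05, "C": -0.02, "D": -0.01, "E": 0.00,
    "F": 0.01, "G": 0.02, "H": 0.03, "I": 0.06, "J": 0.09,
}
example_weights = reversal_weights(example_returns)
```

With ten names, the two biggest losers each get +25% and the two biggest winners each get -25%, so the book is dollar neutral with gross leverage of 1.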

It sounds like it works, or why would it be so popular? Is Quantopian wanting to differentiate from the hedge fund pack?

You seem to be sending the message that short-term mean reversion is a non-starter with Quantopian, yet Rob did a whole study on it, published it on your site, it was highlighted as a nice example, etc. This seems like inconsistent messaging. What am I missing?

Peter Harrington

@Dan Dunn if you wanted to open source this and were worried about data you could use the data from the Quandl Wiki bundle and NASDAQ sector codes, both are freely available. The NASDAQ data also includes market cap, and the fundamental data can be obtained from Sharadar on Quandl.

Edit: Sharadar has free fundamental data here.

Leo M

Can someone explain what the cumulative common returns and cumulative specific returns are, and what the preferred chart patterns for them are? The post at the top indicates a preference for high specific returns and low common returns, while total seems to be the sum of the two. Are the common returns the sum over the 17 sector and style risk exposures, and do they need to be minimized or bounded?

Also, I would like to know what the Y axis numbers (0.1, 0.5, etc.) represent. Beyond the intuition that higher numbers and outliers for common risk count against an allocation, a documentation section in the wiki help pages on acceptable thresholds, as soon as possible, would help a lot. A lecture on this would be very helpful as well!

Nov 11, 2017

The risk model consists of a series of cascading linear regressions on each asset. In each step in the cascade, we calculate a regression, and pass the residual returns for each asset to the next step.
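
To make the quoted description concrete, here is a toy sketch of what a cascade of regressions could look like for a single asset. This is only my reading of the sentence above, not Quantopian's implementation: I assume one regressor per step, ordinary least squares with an intercept, and that the intercept stays in the residual. The names `ols_slope` and `cascade_regressions` are hypothetical.

```python
def ols_slope(x, y):
    """Slope of a one-variable least-squares fit of y on x (with intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance = sum((xi - mean_x) ** 2 for xi in x)
    return covariance / variance


def cascade_regressions(asset_returns, factor_returns_in_order):
    """Regress on each factor in turn, passing the residual to the next step.

    Returns the estimated exposure per step and the final residual series.
    """
    residual = list(asset_returns)
    exposures = []
    for factor in factor_returns_in_order:
        beta = ols_slope(factor, residual)
        exposures.append(beta)
        # What this step cannot explain is handed to the next step.
        residual = [r - beta * f for r, f in zip(residual, factor)]
    return exposures, residual


# Made-up, mutually uncorrelated factor return series.
sector = [0.01, -0.01, 0.01, -0.01]
style = [0.01, 0.01, -0.01, -0.01]
asset = [1.5 * s + 0.5 * t for s, t in zip(sector, style)]
exposures, residual = cascade_regressions(asset, [sector, style])
```

In this toy case the two factors are uncorrelated, so the cascade recovers exposures of 1.5 and 0.5 and leaves nothing in the residual. With correlated factors, the order of the steps would matter, because earlier steps absorb the shared part.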

I'm trying to understand what's going on here. It seems that there are some underlying assumptions about how the various alpha factors combine, i.e. the model for returns. I'm imagining:

alpha = w1*alpha1 + w2*alpha2 +...wN*alphaN

So, the analysis tool finds the weights for a subset of N alphas, and attributes the rest to specific return (due to alphas not in the risk model)?

Generally, it would be nice to have some background here. Is the Quantopian risk model unique in the industry? Based on any published underlying theory? Etc. It just got plunked out, without any justification and description of its reasonableness and limitations. If you are taking it to be proprietary and confidential, then just say so. Otherwise, it would be helpful to understand it better.

@ Thomas - regarding sharing code and specification details, if I were a customer, I'd want access. And I'd want to be able to run a backtest on the algo in which I'm investing (presumably you can combine licensed algos in this fashion). And then run my own tear sheets, draw my own independent conclusions, and decide how much money to invest. Will customers be able to do this sort of hands-on analysis?

Nov 11, 2017

@ Leo M Yes. The cumulative common returns are the returns obtained by all of an algo’s exposures in common factors (sector factors and style factors). The cumulative specific returns are the returns not explained by the common factors. An algo’s return depends on both the common factor return and specific return (algo return = common factor return + specific return).

The main function of those figures is to help you understand and improve your algo. For example: Where do the returns of my algo come from? Are my returns mainly driven by common factors instead of my novel idea? Do I unintentionally have a very large exposure to the technology sector? I think that after you have played with the performance attribution on your algos for a while, you will be able to tell me which chart patterns are good and which you like.

In my personal opinion, I prefer not to have my algo’s returns mainly driven by common factors. If my algo’s returns are mainly driven by common factors, why do I spend time on writing it and trade it with transaction fees? I could just buy the corresponding ETFs and hold them, right?

Also, collecting the info from all these three figures would provide more insights. Even more, using the research platform to generate some customized figures would be interesting and useful. https://www.quantopian.com/posts/introduction-to-the-quantopian-risk-model-in-research

About the number in y axis, I am not sure which figure’s y axis you mean. If it is the one from the first two figures, it is just the return we used to describe pnl in the percentage sense. If it is the one from the last figure, it is factor exposure. It describes the relationship between a dependent variable (algo returns) and explanatory variable (factor returns). For example, if the factor is the market factor, this factor exposure is just the “beta” we are familiar with.

We will release more documents about the risk model and performance attribution step by step in the near future.

Nov 11, 2017

@ Josh -

It would help put things in context if you were to run a backtest on your 1337 Street Fund (the Q Fund) and posted the risk assessment result (obviously, a standard performance tear sheet would be interesting, too). Is this feasible? Or would you have regulatory constraints or otherwise?

Leo M

Nov 12, 2017

@Rene thanks for detailed explanation.

Leo M

Nov 12, 2017

@Rene as to your statement "If my algo’s returns are mainly driven by common factors, why do I spend time on writing it and trade it with transaction fees?", how would we know what "mainly" means if it is not defined or identified by the code? "Mainly" can mean many different things to 160k people looking at their algorithms. I believe some clarity from automation (on what Q considers "mainly") would help in characterizing violations, even if you only classify them broadly as low, medium, or high from your perspective.

Leo M

Nov 12, 2017

@Rene the 100% XLE examples that you and Thomas gave in this post are extreme cases. The chance of anyone writing that algorithm and actually getting an allocation is practically zero (because that kind of strategy will never satisfy your strict filters on max drawdown, Sharpe and volatility). What I would be interested in is this: if I did not build my strategy on a style factor, and I am using order_optimal_portfolio to manage my sector, beta and position concentration based on the allocation page thresholds, am I still likely to hit any risk factor by accident, severely enough to get the algorithm automatically disqualified? If a high-level summary suggests that such a risk exists, that knowledge in itself will help one dig further and investigate the algorithm.

While I agree that style factors like momentum, mean reversion, sector rotation etc. are viable, popular industry strategies, there are tons of other algorithms with no intent to ride those factors, and clarity that we are not accidentally riding those factors (beyond Q thresholds) to realize our returns is valuable information in itself.

@Leo: Yes, those examples are contrived to make a point. It certainly is possible to pick up risk exposures unintentionally; the main purpose of the risk model is to uncover that. So yes, if your strategy picks up large exposures by accident, it is less likely to receive an allocation. We have not quite nailed down where the exact thresholds are yet (they won't be overly constraining). The purpose of the risk model release is to get us thinking about exactly the questions you are raising and to give users a different way to look at their algorithms in relation to broader market forces.

@Grant Kiehne : Good question! I’d like to quote another paragraph from Dr. Rob Reider’s post.

Given how common short-term mean-reversion strategies are, and more importantly, how well and consistently these strategies have held up over the years, it’s worthwhile to consider ways to enhance the performance of a simple mean-reversion strategy. Of course, every quant fund has their own way of implementing a mean-reversion strategy, which they often refer to as their “secret sauce”. In this post, I’d like to offer some possible ingredients that the community can use to create their own secret sauce.

As far as I understand, Quantopian is interested in short-term mean-reversion strategies, but not in simple mean-reversion strategies. We want algos with novel ideas (i.e. the “secret sauce” mentioned in the paragraph above).

To be clear, the style factor used in the Q risk model is called short-term reversal. It is defined as the return difference between stocks with strong recent losses that are expected to reverse (recent loser stocks) and stocks with strong recent gains that are expected to reverse (recent winner stocks) over a short time period. The idea of a mean-reversion strategy, to me, is that prices or returns eventually move back toward the mean or average. Short-term reversal means a quick reversal from a strong gain or loss in the short term.

In other words, if an algo is simply betting on a quick reversal from a strong gain or loss, the Q short-term reversal factor will catch it. If an algo is designed as a mean-reversion algo with a “secret sauce” in it, the Q short-term reversal factor will help check whether it is a real “secret sauce” or not.

@Leo M Yes. Different people have different opinions about “mainly”. I cannot define it explicitly, since algos are different. Suppose you and I wrote an algo together; how would we decide what “mainly” means? I would check a few things first.

If the specific returns do not make money in the backtest and the OOS test, we should not invest money in it, even if the common factor return is great. Here, “mainly” is 100%.

If the performance attribution suggests the specific returns can make money, I would think about commission fees and slippage costs. If the specific returns are 1% but the transaction costs are 2%, we should not invest money in it. Here, “mainly” is still 100%, since the transaction fee is taken into account.

If the specific returns are still positive after taking out the transaction costs, I would check the volatility of the specific returns. If the Sharpe ratio of the specific returns is lower than the Sharpe ratio of some common factor's returns, I would suggest that we not invest in this strategy and instead buy the common factor ETF with the higher Sharpe ratio. Here, answering what “mainly” means already takes some calculation and thought.
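
The three checks above can be sketched roughly as follows. This is only an illustration of the reasoning, not an official test: the function names, the flat per-day cost, the use of a simple sum of daily returns as a proxy for "makes money", and the 252-day annualization are all assumptions of mine.

```python
import math
import statistics


def annualized_sharpe(daily_returns, periods_per_year=252):
    """Mean over sample standard deviation, scaled to a yearly figure."""
    mean = statistics.mean(daily_returns)
    stdev = statistics.stdev(daily_returns)
    return mean / stdev * math.sqrt(periods_per_year)


def check_specific_returns(specific, daily_cost, factor_returns_by_name):
    """Apply the three checks in order and report the first one that fails."""
    if sum(specific) <= 0:
        return "specific returns lose money"
    net = [r - daily_cost for r in specific]  # assumed flat cost per day
    if sum(net) <= 0:
        return "costs exceed specific returns"
    best_factor_sharpe = max(
        annualized_sharpe(series) for series in factor_returns_by_name.values()
    )
    if annualized_sharpe(net) < best_factor_sharpe:
        return "a common factor has a higher Sharpe ratio"
    return "passes all three checks"


# Made-up daily series, purely for illustration.
specific = [0.002, 0.001, 0.003, 0.002]
noisy_factor = {"momentum": [0.01, -0.02, 0.015, -0.005]}
steady_factor = {"value": [0.01, 0.011, 0.0105, 0.0102]}

result_ok = check_specific_returns(specific, 0.0005, noisy_factor)
result_costly = check_specific_returns(specific, 0.005, noisy_factor)
result_beaten = check_specific_returns(specific, 0.0005, steady_factor)
```

A real check would compound returns and use a proper cost model, so treat the verdict strings as a reading aid rather than a rule.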

@Grant Kiehne I am not sure if I understand your question correctly. About analyzing “alpha factors”, Max will have a lecture for analyzing alpha factors soon!

The idea of the risk model is based on a factor model (Q has a good lecture about factor models, https://www.quantopian.com/lectures/factor-risk-exposure). It uses some variables as factors. The choice of variables is subjective, but sectors, styles, countries, etc. are typically used. The Q risk model decomposes each asset’s returns into the returns of different risk factors. Let me use a simple example with only 2 assets to show the idea.

Returns of AAPL = aapl_factor_exposure_1 * sector_factor_returns_1 + aapl_factor_exposure_2 * sector_factor_returns_2 + … + aapl_factor_exposure_11 * sector_factor_returns_11 + aapl_factor_exposure_12 * style_factor_returns_1 + aapl_factor_exposure_13 * style_factor_returns_2 + … + aapl_factor_exposure_16 * style_factor_returns_5 + aapl_residuals

Returns of IBM = ibm_factor_exposure_1 * sector_factor_returns_1 + ibm_factor_exposure_2 * sector_factor_returns_2 + … + ibm_factor_exposure_11 * sector_factor_returns_11 + ibm_factor_exposure_12 * style_factor_returns_1 + ibm_factor_exposure_13 * style_factor_returns_2 + … + ibm_factor_exposure_16 * style_factor_returns_5 + ibm_residuals

In the performance attribution, it combines the above info with EOD holding positions. Suppose you hold 50% AAPL and 50% IBM on day 1; your algo’s return on day 1 can then be written as

Algo return = (50% * aapl_factor_exposure_1 + 50% * ibm_factor_exposure_1) * sector_factor_returns_1 + (50% * aapl_factor_exposure_2 + 50% * ibm_factor_exposure_2) * sector_factor_returns_2 + … + (50% * aapl_factor_exposure_16 + 50% * ibm_factor_exposure_16) * style_factor_returns_5 + (50% * aapl_residuals + 50% * ibm_residuals)

The common factor return = (50% * aapl_factor_exposure_1 + 50% * ibm_factor_exposure_1) * sector_factor_returns_1 + (50% * aapl_factor_exposure_2 + 50% * ibm_factor_exposure_2) * sector_factor_returns_2 + … + (50% * aapl_factor_exposure_16 + 50% * ibm_factor_exposure_16) * style_factor_returns_5

The specific return = Algo return - The common factor return.
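
The same arithmetic can be written as a small sketch. To keep it short it uses two factors instead of sixteen, and every number is made up; `attribute_returns` is a hypothetical helper name, not part of any Quantopian API.

```python
def attribute_returns(weights, exposures, factor_returns, residuals):
    """Split one day's algo return into common and specific parts.

    weights:        {asset: portfolio weight at end of day}
    exposures:      {asset: [exposure to factor 1, ..., factor K]}
    factor_returns: [return of factor 1, ..., factor K]
    residuals:      {asset: residual return not explained by the factors}
    """
    n_factors = len(factor_returns)
    # Asset return implied by the factor model.
    asset_returns = {
        asset: sum(exposures[asset][k] * factor_returns[k] for k in range(n_factors))
        + residuals[asset]
        for asset in weights
    }
    algo_return = sum(weights[a] * asset_returns[a] for a in weights)
    # Portfolio exposure to each factor is the weighted sum of asset exposures.
    portfolio_exposures = [
        sum(weights[a] * exposures[a][k] for a in weights) for k in range(n_factors)
    ]
    common_return = sum(e * f for e, f in zip(portfolio_exposures, factor_returns))
    specific_return = algo_return - common_return
    return portfolio_exposures, common_return, specific_return


# Made-up inputs: 50% AAPL, 50% IBM, one sector factor and one style factor.
weights = {"AAPL": 0.5, "IBM": 0.5}
exposures = {"AAPL": [1.0, 0.2], "IBM": [0.8, -0.4]}
factor_returns = [0.01, 0.02]
residuals = {"AAPL": 0.003, "IBM": -0.001}

portfolio_exposures, common_return, specific_return = attribute_returns(
    weights, exposures, factor_returns, residuals
)
```

With these inputs the portfolio exposures are 0.9 and -0.1, the common factor return is 0.7%, and the specific return is 0.1%, which equals the weighted average of the two residuals.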

For how a general risk model and performance attribution work, please see https://www.quantopian.com/lectures/risk-constrained-portfolio-optimization. We will release the details of the Q risk model step by step. The goal of the Q risk model is to help you understand your algos, manage risk, and improve your algos. We would like to find the best way to show you how to use the risk model to improve your algos without showing you too many math details at once about how to build a risk model, which may be distracting and not very useful.

Leo M

@Thomas, thanks for clarifying.

@Rene. The naked eye is good for spotting the most egregious violations. When you are plotting so much data in a graph with 17 lines, a program can analyze those data points more thoroughly, using statistics such as mean, median, standard deviation and many more hypotheses, and derive a standard, intelligent summary (based on intuition that you can translate into an algebraic formula or a multivariable equation). Those deductions will in most cases be better than human judgement. I'm not going to add more to this. I think I have said too much already in this thread, but I have probably conveyed what I am trying to say.

Rene -

Thanks for the feedback and references. The lecture https://www.quantopian.com/lectures/factor-risk-exposure appears to provide some details on what you are likely doing under the hood.

I think if I'm interpreting correctly, if my algo's alpha is (a, b, & c constants):

alpha = a*momentum + b*short_term_reversal + c*momentum*short_term_reversal

where momentum & short_term_reversal are exactly as you've defined them in your code, then it would fail miserably for an allocation (presumably, you account for interactions, as I've shown).

However, if I found some more clever way of combining the factors (ML or perhaps based on some additional information?), then maybe it would be o.k. (assuming everything else is good, like returns, Sharpe ratio, beta to SPY, etc.)?

Tony Morland

Hi @Grant, @Rene,

Search for meaning:
I understand that mathematically it may be reasonable to expect cross-terms for any quadratic or higher-order relationships but, in the equation:
alpha = a*momentum + b*short_term_reversal + c*momentum*short_term_reversal
while the meaning of the a*momentum and b*reversal terms is clear, what meaning exactly do you ascribe to the cross-term c*momentum*reversal?

@Leo M Thanks a lot for all of your questions. We are working on improving the performance attribution tearsheet so that it is easier for users to read and understand. For now, the research platform is the best answer; it is very powerful. You can use it to generate customized figures and analyze the risk hidden in your algo.

As for getting an allocation, your algos will be compared with other users’ algos. We would choose the most novel algos with the highest Sharpe ratio, low common factor risk, etc. You may want to find the “optimal status” of your algo first, and then use it in the contest later.

We haven’t set official thresholds or defined a “risk box” that is our answer to what we want. We are enjoying the community conversation about the new risk model, and we’re thinking about the best way to define what we want. The thresholds that I, personally, suggest you try on your algos are listed here for your reference. But please take them as a first draft, nothing more.

Sectors: factor exposure ~< 5%
Short-term reversal: factor exposure ~< 50%
All other Style factors: factor exposure ~< 20%

*measured by the 95th percentile of the absolute value of the daily factor exposures.

I hope these numbers simply help you understand where your algo’s profit and risk come from. I truly do not want these numbers to limit you in finding the “optimal status” of your algo.
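
For anyone who wants to try the draft numbers above on their own exposures, here is a minimal sketch of the check. It assumes daily factor exposures are available as plain lists keyed by factor name; the factor names, the `exposure_report` helper, and the linear-interpolation percentile are assumptions of this sketch, and the limits are only the first-draft values listed above.

```python
# First-draft limits from the list above, as fractions rather than percents.
DRAFT_LIMITS = {"sector": 0.05, "short_term_reversal": 0.50, "other_style": 0.20}


def percentile(values, pct):
    """Percentile with linear interpolation between the two nearest ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty series is undefined")
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])


def exposure_report(daily_exposures, sector_names):
    """Compare the 95th percentile of |daily exposure| with the draft limit."""
    report = {}
    for name, series in daily_exposures.items():
        if name in sector_names:
            limit = DRAFT_LIMITS["sector"]
        elif name == "short_term_reversal":
            limit = DRAFT_LIMITS["short_term_reversal"]
        else:
            limit = DRAFT_LIMITS["other_style"]
        p95 = percentile([abs(x) for x in series], 95)
        report[name] = {
            "p95_abs_exposure": p95,
            "limit": limit,
            "within_limit": p95 < limit,
        }
    return report


# Made-up daily exposures for three hypothetical factor names.
example_exposures = {
    "energy": [0.02, -0.03, 0.04, 0.10, 0.01],
    "short_term_reversal": [0.1, -0.2, 0.3, -0.4, 0.45],
    "momentum": [0.05, 0.1, -0.15, 0.1, 0.05],
}
example_report = exposure_report(example_exposures, sector_names={"energy"})
```

In this made-up example only the energy exposure is flagged, because one day at 10% pulls its 95th percentile above the 5% sector limit.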

Please trust that we think your questions are very valuable to us and the community, so keep them coming!

Leo M

@Rene, thanks for your feedback and some guidance on thresholds. Excellent work on getting this feature in the platform.

@ Grant Kiehne:

Thanks for your question! The momentum factor Q defined is “return differences between stocks on an upswing (winner stocks) and the stocks on a downswing (loser stocks) over 11 months.” The short-term reversal factor Q defined is “return differences between stocks with strong losses to reverse (recent loser stocks) and the stocks with strong gains (recent winner stocks) to reverse in a short time period”.

Both the momentum factor and the short-term reversal factor are very simple factors that are commonly used in risk models. Just to be clear, short-term reversal factor is not equivalent to short-term mean reversion algo.

To say whether the alpha you described is fine, I need to know a few answers in advance.

What is the “momentum” you defined here? Is it simply betting on winner stocks, measured by 11-month returns?
What is the “short-term reversal” you defined here? Is it simply betting on a quick reversal from a strong gain or loss in price, for example by using the relative strength index?
What do your alpha returns look like? Do they profit from “momentum”, “short_term_reversal” or “momentum*short_term_reversal”? What risk do you take from each of them separately and together?

I do not see any problem with cleverly combining good alpha factors. In fact, Max will later give a lecture on a convenient way to check whether alpha factors are true “alpha”.

There is a webinar on Thursday. If you send the tearsheet to [email protected], and register for the webinar, you can get a live analysis of your algo from our Managing Director of Research, Dr. Jess Stauth.

Nov 15, 2017

Hi Rene -

Regarding my expression:

alpha = a*momentum + b*short_term_reversal + c*momentum*short_term_reversal

The momentum and short_term_reversal factors would be identical to those used in your risk model. Presumably, when the algo is analyzed with your risk model tear sheet, it would see very strong momentum and short_term_reversal risks (and the total risk due to momentum and short_term_reversal would be the same, regardless of whether c = 0 or not). Correct?

In other words, adding interaction terms of the form above won't help mitigate risk, correct?

Nov 16, 2017

@ Rene -

If you could elaborate on your statement above, it would be helpful:

Just to be clear, short-term reversal factor is not equivalent to short-term mean reversion algo.

Regarding your comment on thresholds:

Sectors: factor exposure ~< 5%
Short-term reversal: factor exposure ~< 50%
All other Style factors: factor exposure ~< 20%

In the end, I think you need to normalize the risk, relative to what you would like to see in individual algos that would be accretive to the 1337 Street Fund (projecting forward at least 6 months, since you need that much time for an algo to run out-of-sample). A simple 3-level stop-light indicator would be nice: green (o.k.), yellow (needs improvement), red (a non-starter).

I gather that there are certain dynamic dials that fund managers/CIOs would use to regulate a fund. For Quantopian and its 1337 Street Fund of funds, I'm supposing that the risk model is one of those dials, that will be adjusted periodically. How do you intend to manage revisions of the risk model, and how often (e.g. quarterly?)? Will users be notified of upcoming changes?

Personally, with regard to Quantopian, I'm in total information overload; there is way too much complication. It's all wonderful, I suppose, if one is a quant professional, with appropriate training and 10 years of industry experience, but for me, the fire hose of information from the backtester, Alphalens, Pyfolio and the risk model (not to mention the learning curve required just to use Python and the API), combined with no real quantitative go/no-go specification for what will get an allocation makes this a real challenge for an amateur. So, if you can figure out ways to distill out only the most salient information (e.g. how close am I to the allocation mark, on a scale of 1 to 10, and a simple explanation for the ranking), it would be most helpful.

Jan 7, 2018

@ Josh -

I'd like to understand better precisely what risk_loading_pipeline() and RiskModelExposure are doing under the hood. My thinking is the best way to do this would be for you to publish the code along with the risk-loading "white paper" Rene mentioned would be released. To get the full picture, I think we'd need code for:

Optimization API
The risk-loading pipeline (i.e. quantopian.pipeline.experimental.risk_loading_pipeline)

Could this code be made available? Or are you treating it as proprietary at this point?

Viridian Hawk

Feb 18, 2018

Can somebody explain the volatility style risk to me? I get that some stocks will consistently exhibit more volatility than others, and so if you long stocks on one end of the spectrum and short the other, or vice versa, you'll have an imbalance in exposure to volatility.

I guess the first thing I don't understand is, are there strategies where returns are attributed to this style (so as if it weren't defined as a "factor" you'd otherwise think of it as "alpha")? What does that look like? I guess if you long SPLV and short SPY it does generate some "alpha," or negative alpha, depending on the time frame. Is that essentially the phenomenon this risk style is there to explain? It seems so small that it's a non-issue.

Presumably this style risk overlaps somewhat, if not quite a bit, with beta though, right? Or is the beta factor subtracted from the volatility factor to arrive at idiosyncratic volatility?

And, finally, I was curious whether I understand the values correctly. Do positive values for the risk factor indicate a long bias towards highly volatile stocks while negative numbers indicate a short bias towards them? So with either bias, if market volatility picks up, the strategy's returns would become more volatile, right? Low or zero volatility risk can be achieved by either avoiding high volatility stocks altogether or at least having as much high volatility on the long side as on the short side, right? But if the volatility is idiosyncratic (not correlated to market), then I don't see how having a balanced exposure on the long and short side does any good -- they won't be out of phase, so they won't cancel each other out.

My intuition tells me that volatility as a risk/returns factor is going to be a negligible issue in comparison to all these other factors, or am I underestimating it?

Peter Harrington

Feb 18, 2018

@viridian Hawk
Volatility is a "known" source of alpha; it actually means low volatility. Check out this article.

The researchers uncovered a negative relationship between risk and return: high volatility stocks actually tended to deliver lower returns, while low-vol stocks outperformed.

Feb 18, 2018

I guess in a long-short portfolio, the ideal stocks would be those with no volatility--returns (both positive and negative) like a bank CD. Simply go long on the ones with positive returns and short on the ones with negative returns and call it a day.

Is there any evidence that the low-volatility effect applies to equity long-short? Also, is this, in fact, the risk that the Q risk model is addressing?

Viridian Hawk

Feb 20, 2018

I would think that for a long-short portfolio, stocks that exhibit a high correlation to market volatility (high beta) would be fine so far as that volatility is balanced across the long and short side, thus phase-canceling the noise. However, it doesn't seem it would matter or even help to balance long/short exposure to stocks that exhibit high levels of idiosyncratic volatility -- instead, that would probably best be dealt with by limiting the size of individual volatile positions, and perhaps by having enough of them, because if you add up enough jagged noise it should become smoother.
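
A quick way to check the "add enough jagged noise" intuition is a small simulation. This is only a sketch under strong assumptions: the idiosyncratic returns are independent, normally distributed, equally volatile, and equally weighted, and the 3% daily volatility is a made-up number.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, daily_vol = 5000, 0.03  # made-up idiosyncratic volatility per stock

def portfolio_vol(n_positions):
    """Std. dev. of an equal-weight basket of independent noise series."""
    noise = rng.normal(0.0, daily_vol, size=(n_days, n_positions))
    return noise.mean(axis=1).std()

# Compare the simulated basket volatility with the 1/sqrt(N) rule of thumb.
for n in (1, 4, 16, 64):
    print(n, round(portfolio_vol(n), 4), round(daily_vol / np.sqrt(n), 4))
```

With independent noise the basket volatility falls roughly as 1/sqrt(N), so position count and position sizing do the smoothing here, not the long/short balance.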

@Peter -- as of July 2016 that trend has had a distinct reversal. Since then, low volatility vs SPY has produced -0.06 alpha. So I guess that itself illustrates why it's a risk! The market giveth and the market taketh away.

Feb 21, 2018

A lecture notebook and lecture with some of the under-the-hood details:

https://www.quantopian.com/lectures/risk-constrained-portfolio-optimization
https://youtu.be/S7yIWLXxgXs

Still no publication of the built-in risk model details, but maybe it is being considered proprietary and will never be published?

Josh Payne

Feb 21, 2018

Hi Grant,

We have a whitepaper in the works. We're optimistic about publishing it soon.

Thanks
Josh

Peter Harrington

Feb 21, 2018

@Josh

I understand that each asset in the universe has a set of associated risk factors.
Do these values change with time or are they fixed? For example: if INTC's exposure to momentum was 0.001 on 2010/1/1 would it also be 0.001 on 2018/1/1? I feel that ideally these should change. If they do change, over what period are these values calculated? Trailing 90 days?

Not trying to poke holes in this great body of work, just trying to understand.
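
To illustrate what a time-varying exposure could look like, here is a minimal sketch. It assumes the exposure is estimated as a regression slope of stock returns on factor returns over a trailing window; the 90-day window is just the number from the question, the data is synthetic, and none of this is confirmed to be how the Q risk model works.

```python
import numpy as np

def trailing_beta(stock_ret, factor_ret, window=90):
    """Regression slope of stock on factor returns over each trailing window."""
    betas = np.full(len(stock_ret), np.nan)
    for t in range(window, len(stock_ret) + 1):
        s = stock_ret[t - window:t]
        f = factor_ret[t - window:t]
        betas[t - 1] = np.cov(s, f)[0, 1] / np.var(f, ddof=1)
    return betas

rng = np.random.default_rng(1)
factor = rng.normal(0, 0.01, 500)
# Synthetic stock whose true exposure jumps from 0.5 to 1.5 halfway through.
true_beta = np.where(np.arange(500) < 250, 0.5, 1.5)
stock = true_beta * factor + rng.normal(0, 0.002, 500)

betas = trailing_beta(stock, factor)
early, late = betas[249], betas[499]
print(round(early, 2), round(late, 2))
```

The estimate tracks the change in the true exposure, but only with a lag of up to one window length, which is the trade-off behind any choice of trailing period.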

Thanks Josh -

Starting with a very basic description in words and accessible math in a white paper would be great. And if there are free references, that would be nice, too (I'm not inclined to buy books or articles).

I suspect that there is similarity with the Design of Experiments (DOE) methodology of process engineering (e.g. see http://www.itl.nist.gov/div898/handbook/pri/section1/pri11.htm). It is worth noting that you do not include factor interaction terms in your lowest-order linear model. What is the justification? Also, in DOE training, it is emphasized to determine the statistical significance of the factors and to drop them if they are not significant (i.e. revise the model and re-fit). If the price of tea in China explains nothing, it shouldn't be in the model. There are standard statistical tests to decide whether a given factor (actually, term in the model) should be retained or dropped. Yet, you retain all terms in the model, without testing for statistical significance--I'd also be interested in the justification.
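
As a sketch of the kind of significance test I mean, here is an ordinary least squares fit with t-statistics on synthetic data. The factor names and coefficients are invented for illustration; this is not the Q risk model's fitting procedure.

```python
import numpy as np

def ols_t_stats(X, y):
    """OLS coefficients and t-statistics (X must include an intercept column)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)           # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)      # coefficient covariance matrix
    return beta, beta / np.sqrt(np.diag(cov))

rng = np.random.default_rng(2)
n = 1000
real_factor = rng.normal(size=n)
tea_price = rng.normal(size=n)                 # unrelated to y by construction
y = 0.5 * real_factor + rng.normal(size=n)
X = np.column_stack([np.ones(n), real_factor, tea_price])

beta, t = ols_t_stats(X, y)
print(np.round(beta, 3), np.round(t, 2))
```

In a DOE-style workflow, a term whose |t| stays small (a common rule of thumb is below about 2) would be a candidate to drop before re-fitting.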

There also seem to be some assumptions of stability/stationarity in your risk model. In DOE training, this is something that is also emphasized. Factors are analyzed to see that they are "under control" (e.g. via a control chart and other standard methods). Taking a specific example, for ShortTermReversal you are using the 15-day RSI, so if that built-in Pipeline factor is analyzed using Alphalens across the QTradableStockUS universe, what is the conclusion? Is it a stable, predictive factor, and at what level of statistical significance?

The other potential issue I see is that for the Optimize API risk constraint to work, everything has to be stable. The risk_loading_pipeline is based on a trailing window, so for it to be predictive going forward, I think certain assumptions need to be made. In the case of sector exposures, this is probably a slam dunk, since sector assignments don't change much (although I question whether arbitrary industry sectors make sense, or if some sort of clustering technique would be better for diversification of returns). However, for the style exposures, it is not so clear. Perhaps Size and Value would be kinda stable, but Momentum, ShortTermReversal, and Volatility are not so obviously stable for a given stock from one period to the next.

On a practical note, it should be straightforward for users to pick and choose which risk model factors to attempt to control via the Optimize API, and which ones not to control. Is there a way to do this? For example, say I just wanted to control sector risk, and not style risk--how would I disable the style risk control?

Guy Fleury

@Grant, I concur. Your points are all legitimate. They are all questions that need answers otherwise the trading system (program) is making delayed linear decisions or making assumptions based on constants which might not be well anchored to reality.

I have expressed this before, you need your own set of outside control functions to override the Optimize API functions. Force it to do what you want it to do even if it is suggesting something else as the optimum.

One makes a profit on a trade only if Δp > 0. If you are trying by all means to clobber this price variability into submission in order to achieve low volatility and drawdowns, then what you get is: Δp → 0, a smaller and smaller profit opportunity. So, no wonder that beta-neutral and market-neutral strategies with low drawdown requirements generate so little in the alpha department.

Karl Mun

Yes, Grant. To leave a given risk factor unconstrained:

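import quantopian.optimize as opt

# Pass opt.NotConstrained for each min/max bound that should be left free;
# bounds not listed here keep the risk model's default limits.
opt.experimental.RiskModelExposure(
    risk_model_loadings=context.risk_loading_pipeline,
    version=opt.Newest,
    min_momentum=opt.NotConstrained,
    max_momentum=opt.NotConstrained,
    min_short_term_reversal=opt.NotConstrained,
    max_short_term_reversal=opt.NotConstrained,
    min_volatility=opt.NotConstrained,
    max_volatility=opt.NotConstrained,
)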

See threads by Abhijeet on this.

Thanks Karl,

Good to know that the capability is there. I guess the explicit constraints simply override the ones provided by risk_loading_pipeline.

@ Guy -

Yeah, I'm kinda skeptical how effective this might be for the style risks, but I only have a qualitative feel for things at this point. Maybe the white paper will include a study of the risk model's effectiveness in controlling risks, for various slices of the QTU?

Feb 23, 2018

Hi Josh -

Rene has described the ShortTermReversal risk factor as the 15-day RSI (presumably you are using the built-in Pipeline factor). What are the other style factors, specifically? It would be nice to know. For example, I've been playing around with adding the RSI as a factor (in a multi-factor strategy), to null out the ShortTermReversal. It would be helpful to know the other factors.

Joakim Arvidsson (Cream Mongoose)

Feb 23, 2018

@ Guy -

Thanks for your comment:

One makes a profit on a trade only if Δp > 0. If you are trying by all means to clobber this price variability into submission in order to achieve low volatility and drawdowns, then what you get is: Δp → 0, a smaller and smaller profit opportunity. So, no wonder that beta-neutral and market-neutral strategies with low drawdown requirements generate so little in the alpha department.

What you may be saying is that the basic concept of risk-reward still applies, along the lines of an efficient frontier. The Quantopian framework/workflow has bewilderingly many degrees of Pythonic freedom. In the end, if they ran a proper Monte Carlo simulation and plotted expected return versus standard deviation in expected return, they'd end up with the classic hyperbola (and the risk-free rate would pop out...somehow not relevant to the Q team, but I suspect you and other Q users are correct on this account).

My read is that the right way to go about this business is to write an algo, parameterize it, and then run 10,000 backtests as a Monte Carlo, make the return versus risk plot, and see what it says. The Q platform is scalable in terms of parallel backtests, so the Q team should be able to do this easily. I think this would be very informative.

As a note, I'm not necessarily suggesting opening up the capability to run parallel backtests to users; I'm just saying that if Q is working on a white paper on the risk model, if they want to understand it, a Monte Carlo approach would likely shed light on the gross characteristics.
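
Roughly what I have in mind, as a toy sketch only. Here run_backtest is a hypothetical stand-in that returns synthetic daily returns scaled by a single "leverage" parameter; a real study would call the actual backtester and vary the algo's real parameters.

```python
import numpy as np

rng = np.random.default_rng(3)

def run_backtest(leverage, n_days=252):
    """Hypothetical stand-in for a backtest: returns synthetic daily returns."""
    base = rng.normal(0.0004, 0.005, n_days)
    return leverage * base

results = []
for _ in range(1000):
    lev = rng.uniform(0.5, 3.0)                 # randomly drawn algo parameter
    daily = run_backtest(lev)
    ann_ret = daily.mean() * 252                # annualized mean return
    ann_vol = daily.std(ddof=1) * np.sqrt(252)  # annualized volatility
    results.append((lev, ann_vol, ann_ret))

results = np.array(results)
# In this toy setup, risk should rise with the leverage parameter.
corr_lev_vol = np.corrcoef(results[:, 0], results[:, 1])[0, 1]
print(round(corr_lev_vol, 3))
```

Scatter-plotting column 2 (return) against column 1 (volatility) gives the return versus risk picture; with a real algo the interesting part is how the cloud changes when the risk constraints are switched on and off.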

james hastie

Feb 28, 2018

Can anyone tell me how I would be able to get a list of monthly returns for a backtest? Either to replace the monthly returns heat map or to go alongside it.
Thanks in advance,
James

Viridian Hawk

Feb 28, 2018

@James, click on "returns" on the side menu.

james hastie

Feb 28, 2018

@Viridian thanks. Totally embarrassed I overlooked that, thought you had to call it from within a notebook.
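
For anyone who does want the numbers in a notebook, here is a minimal sketch. It assumes you already have the backtest's daily returns as a pandas Series indexed by date; the series below is synthetic, not pulled from a backtest object.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns standing in for a backtest's daily return series.
dates = pd.bdate_range("2018-01-01", "2018-03-30")
daily = pd.Series(np.full(len(dates), 0.001), index=dates)

# Compound the daily returns within each calendar month.
monthly = (1 + daily).groupby(daily.index.to_period("M")).prod() - 1
print(monthly)
```

Compounding (rather than summing) the daily returns is what makes the monthly figures consistent with the cumulative return curve.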

Mar 30, 2018

Hi Josh,

Has the white paper on Q's Risk Model been published yet? Apologies if I've missed it. If not, any ETA when you think it may be available? Thanks.

Josh Payne

Mar 30, 2018

Almost done! I hate giving an ETA because we're always wrong but I expect it in the next week or two.

Joakim Arvidsson (Cream Mongoose)

Apr 2, 2018

Great, thanks Josh! Looking forward to reading it.

Matthew Thomas

Apr 6, 2018

To anyone interested in the paper by Fama and French (1992) where the factor model was introduced - which served to improve the CAPM in terms of ability to explain variation in returns - here is a link to their paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.139.5892&rep=rep1&type=pdf

Pej H

Apr 6, 2018

Can someone provide the formula for "Combined Sharpe Ratio" in the live tear sheets when taken from in-sample and out-of-sample results?

May 4, 2018

Hi Josh -

How's the Q risk model white paper coming along?

Michael Matthews

Does anyone know the justification for the cascading regression procedure that was used? I was trying to look up information on it, but wasn't sure what the statistical method was called.

I guess my question is: Why is the sector regression computed first followed by a regression of the sector residuals on the style metrics as opposed to just running a one stage regression? Does it have to do with a lack of independence between the sector residuals and the style metrics?

Luca

@Michael Matthews I had the very same question when I first read about the risk model. I hope we can learn more about this, hopefully by reading the Q risk model white paper when it is ready.

Michael Matthews

I just realized it may have something to do with the difference in what is being computed between the two regressions. I just watched the QuantCon lecture on the topic. It appears that in the first stage, they are computing a factor "loading". Whereas in the second stage, they are computing the factor "return" with the loadings being the standardized metrics (e.g. maybe a cross-sectional z-score).

In other words, the first regression computes factor loading/exposures given factor returns. The second regression computes factor returns given factor loadings/exposures.

I figure this might be an explanation for the two-stage approach, but I'm just guessing. I will be on the lookout for the white paper as well.
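
Here is a sketch of that reading on synthetic data, using a single sector and a single style factor. It is my guess at the structure, not the published method: stage 1 is a time-series regression per stock (factor returns known, loadings estimated), and stage 2 is a cross-sectional regression per day (loadings known, factor returns estimated).

```python
import numpy as np

rng = np.random.default_rng(4)
n_days, n_stocks = 250, 200

sector_ret = rng.normal(0, 0.01, n_days)         # known sector factor returns
true_sector_beta = rng.uniform(0.5, 1.5, n_stocks)
style_z = rng.normal(size=n_stocks)              # known style loadings (z-scores)
true_style_ret = rng.normal(0, 0.005, n_days)    # style factor returns to recover

returns = (np.outer(sector_ret, true_sector_beta)
           + np.outer(true_style_ret, style_z)
           + rng.normal(0, 0.01, (n_days, n_stocks)))

# Stage 1 (time series, per stock): factor returns known, estimate loadings.
X1 = np.column_stack([np.ones(n_days), sector_ret])
coef1, *_ = np.linalg.lstsq(X1, returns, rcond=None)      # shape (2, n_stocks)
sector_beta_hat = coef1[1]
residuals = returns - X1 @ coef1

# Stage 2 (cross section, per day): loadings known, estimate factor returns.
X2 = np.column_stack([np.ones(n_stocks), style_z])
coef2, *_ = np.linalg.lstsq(X2, residuals.T, rcond=None)  # shape (2, n_days)
style_ret_hat = coef2[1]

print(np.corrcoef(style_ret_hat, true_style_ret)[0, 1])
```

Seen this way, the two stages answer different questions, which would explain why they are not collapsed into a single regression.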