London Meetup: Machine Learning and Non-Stationarity

NOTE: The example in the notebook no longer works due to MLPClassifier no longer being whitelisted by our security system. I've put in a request to have it whitelisted, but no promises on timeline as we are unsure whether the code implementing the method is insecure. The pre-rendered examples should still make sense.

I gave a talk tonight in London on how non-stationarity can negatively affect machine learning techniques. TL;DR: if the conditions of your underlying DGP are changing and you're not careful, by the time your ML process has learned what's going on, conditions may have changed again. And this is before even considering the closely related problem of overfitting.
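As a toy illustration of the point (all numbers synthetic, not from the talk), here is a sketch of how a model fit under one regime breaks down once the DGP shifts:

```python
import numpy as np

rng = np.random.default_rng(0)

# Regime 1: y = 2x + noise. Regime 2: the DGP shifts to y = -2x + noise.
x1 = rng.normal(size=500)
y1 = 2.0 * x1 + rng.normal(scale=0.1, size=500)
x2 = rng.normal(size=500)
y2 = -2.0 * x2 + rng.normal(scale=0.1, size=500)

# "Learn" the relationship on regime-1 data (OLS slope through the origin).
beta = np.sum(x1 * y1) / np.sum(x1 * x1)

# In-sample fit is excellent...
mse_in = np.mean((y1 - beta * x1) ** 2)
# ...but by the time we apply the model, the DGP has changed.
mse_out = np.mean((y2 - beta * x2) ** 2)

print(f"learned slope: {beta:.2f}")
print(f"in-regime MSE: {mse_in:.3f}, post-shift MSE: {mse_out:.3f}")
```

The out-of-regime error is orders of magnitude worse than the in-regime error, even though nothing about the fitting procedure was wrong.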

Some great suggestions came up for improving this notebook, including:

  • Try adding more layers/neurons to the neural network to get a sense of how much of the problem is non-stationarity.
  • Try running it on returns rather than prices.

This notebook is more of a learning example. For some great examples much closer to real, in-practice work, check here.

https://www.quantopian.com/posts/machine-learning-on-quantopian

Here are my notes in notebook format; I also covered this lecture.

To sign up for future London events, go here. To see all our upcoming events, go here.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

11 responses

Thanks for the post, some questions though:
- The issue of dealing with time-series data to generate 'real-life' data has been under philosophical/academic debate for quite a while. It seems you have a clear view about this, but can you elaborate a little on when it's recommended to 'take care' of the data (check for stationarity/cointegration and deal with the outcomes), and when we can do without?
- I didn't see any indication in the article of a smoothing process (maybe I missed something). At least for the ADF test, smoothing (in any model) can do the job. Is there any reason you didn't get into it?

Hey Shmulik,

Great questions.

  • What pre-processing steps need to be done depends on the hypothesis you are testing on the data. Each statistical method has a different set of assumptions that must be met in the underlying data. Linear regression, for instance, requires: a linear relationship, multivariate normality, no multicollinearity, no auto-correlation, and homoskedasticity. Based on the hypothesis you are testing and its underlying assumptions, you'd need to test for different conditions in your time series. In general, non-stationarity is a good one to test for. Once you know a series is non-stationary, you generally want to try to model it in different ways. Some examples: take the first-order deltas and model those if they're stationary, or use a model that takes non-stationarity into account.
  • Smoothing processes, such as the Kalman filter, can be very useful. However, care must be taken to ensure that you aren't just masking out and ignoring some dangerous effect. If you have a highly volatile series, smoothing it might make it well behaved. However, the real world will still be volatile, even if your processed time series isn't, so you could be taking on more risk than you expect. We don't have a specific lecture on moving averages, but this lecture covers them.

Thanks Delaney,
A practical concern, do you have a recommendation where to draw the line between smoothing process that is OK, and one that 'corrupts' the data ('highly volatile series' could be in a huge range of possibilities)?

It all comes back to framing everything in terms of testing a hypothesis. What hypothesis do you want to test? For example: "I hypothesize that taking the average market beta over the last 100 days for a given portfolio will be predictive of the market beta over the next 30 days." The line will be different for each experiment and will depend on how and why you're using the smoothing. As a general rule of thumb, use smoothing when it's a natural solution to a problem, not just on data that you don't like or that doesn't behave well. If data doesn't explain an outcome well, blindly applying smoothing or any fancy transformation to it will rarely help. You usually have to find another data source, or identify a specific issue in the first data source that the transformation will fix. If the data is noisy, smoothing can help clean up the noise, but it will also come with a list of other issues you have to consider. Sorry I can't provide more specifics; it's really case-dependent. If you show me a specific example I can provide some specific feedback.

Hi Delaney -

One interesting case would be to consider the use of daily OHLCV bars, versus price values that are average daily values, such as daily VWAP (from minute bars, for example). For trading, such as one might do in the context of the proposed workflow, it would seem that working with price values that are representative of entire day would provide a better basis for evaluating and implementing factors, than single trades represented by the OHLC values. Is there any evidence that for the type of equity market neutral investing that you are working to support, smoothed price/returns data would work better than daily OHLCV bars? As a specific example, say I had a factor that determines the slope of the price trend over the last 5 days. It seems that the error in the slope will be a lot lower if I use daily VWAP values for the prices, versus closing prices (but I could be wrong about this, if the price tends to settle to a single narrow range by the end of the day).

On a specific note, as Grant noted, sometimes the 'transformed' data works better, and the question then becomes: 'are we still dealing with the same data?'
I am trying to implement some ML algorithms to classify intraday movements for specific assets. The smoothed data delivers much better outcomes (I used double exponential smoothing on the minute prices) and the ADF test is passed almost 100% of the time. The troubling issue is: do the predictions on the test cases hold up on 'real-life' data after this massive smoothing?
The reality check I made was to put both the regular and the smoothed data on one graph and see how different they are. After ~50 minutes the plots start to converge, which is no surprise taking into account the high volatility in this specific asset at the start of the day.
I suppose the best way is to optimize the constant parameters of the smoothing algos for low SSE, such that the ADF test passes, but I would like to do without that if possible.
(**at least until Quantopian supplies APIs for that..)
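For readers unfamiliar with the technique mentioned above, here is a minimal sketch of Holt's double exponential smoothing on synthetic minute prices; the `alpha`/`beta` values are illustrative, not the commenter's tuned constants:

```python
import numpy as np

def double_exponential_smooth(prices, alpha=0.3, beta=0.1):
    """Holt's double exponential smoothing: tracks a level and a trend.
    alpha/beta are illustrative smoothing constants, not tuned values."""
    level = prices[0]
    trend = prices[1] - prices[0]
    out = [level]
    for p in prices[1:]:
        prev_level = level
        level = alpha * p + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        out.append(level)
    return np.array(out)

rng = np.random.default_rng(2)
# Noisy upward drift standing in for one day of minute prices (390 bars).
prices = 100 + 0.01 * np.arange(390) + rng.normal(scale=0.5, size=390)
smoothed = double_exponential_smooth(prices)

# Smoothing sharply reduces the variance of minute-to-minute changes --
# which is exactly why it can mask real-world volatility.
print(np.std(np.diff(prices)), np.std(np.diff(smoothed)))
```

The reduction in minute-to-minute variance is the double-edged sword discussed in this thread: the model's inputs get better behaved, but the underlying asset is exactly as volatile as before.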

The real question at the end of the day is: does the outcome of your model on the smoothed data predict future real data values? You can think of the smoothing as part of your model; the main thing is the characteristics of the residuals between your model's predictions and the out-of-sample behavior of the real data.

If smoothing the data allows you to get at some key underlying state which is useful for forecasting future behavior, then great. Obviously you'd want to investigate why, and probably understand a bit more about the system before trading something like this. In general, if you discover a consistent effect in which smoothing in a model produces good forecasts, then, like any scientist, you want to poke around the new areas of uncertainty. Why is this happening? What about the smoothing is causing this to happen? What does this tell me about the system? From this poking around you may come up with an even better model. A commenter on one of our recent podcasts with CWT posted a good video that I think describes this process of poking around well.

https://youtu.be/PzssYxaZ5aU

The podcast can be found here: https://chatwithtraders.com/quantopian-podcast-episode-5-max-margenot/

Hi Delaney -

I'd asked a pretty specific question above:

Is there any evidence that for the type of equity market neutral investing that you are working to support, smoothed price/returns data would work better than daily OHLCV bars?

Do you have any specific guidance? I guess my hypothesis would be something like "Rather than using daily OHLCV bars for a market-neutral strategy for the Quantopian hedge fund, it would be better to use averaged data, such as daily VWAP values." Any thoughts on how to test the hypothesis (or re-frame it, and then test it)? My intuition is that if one is to make daily/weekly trading decisions, then with daily OHLCV bars for computing returns, one is effectively under-sampling, at a lower signal-to-noise than would be optimal. Rather than relying on intuition, perhaps there is a systematic way of approaching the problem?

TL;DR: It will be different for each factor/predictive model using the data. Factors modeling intraday behavior generally need intraday data. Factors modeling behavior over the next 1-10 days generally need daily samples at a minimum. A systematic rule of thumb: if you want to predict how the world works over the next time step with resolution R, you need at least 30-60 samples at frequency 1/R. For a given factor, test which data works better by computing predictions on both, looking at the residuals between each set of predictions and the actual outcomes, then picking the one that is closer and performing an out-of-sample test to validate.

More info:

Apologies, I missed that specific question. I would say that it depends on what you're trying to model. If you're trying to determine what the average state on a day might be, then certainly the average state on previous days may be informative. If you're trying to determine the close price, then the close price may be informative. In general, if you're trying to make daily trades, you want enough history to build statistical confidence in your predictions; the rule of thumb is at least 30-60 samples. So if you were trying to estimate a trend in daily closes, you'd want to look at the last 30 daily closes to see if a linear regression showed any consistent slope. If you were trying to estimate a trend over the next couple of minutes, you'd need the last 30-60 minutes, etc.
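The 30-daily-closes trend check described above might look like this (the closes are synthetic and the drift/noise levels are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical last 30 daily closes: mild upward drift plus noise.
days = np.arange(30)
closes = 100 + 0.2 * days + rng.normal(scale=0.5, size=30)

# Fit a line to the window; the slope is the estimated daily trend.
slope, intercept = np.polyfit(days, closes, 1)

# Standard error of the slope gives a sense of statistical confidence.
# With only a handful of samples this interval would be uselessly wide,
# which is the point of the 30-60 sample rule of thumb.
resid = closes - (slope * days + intercept)
se = np.sqrt(
    np.sum(resid ** 2) / (len(days) - 2) / np.sum((days - days.mean()) ** 2)
)

print(f"estimated trend: {slope:.3f} +/- {se:.3f} per day")
```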

More data is almost always better, so having the VWAP in addition to the OHLCV would most likely improve any model trying to forecast price trends. My systematic way is usually the following:

  • Agnostic of what data is available, do I have any ideas for models that could be used to forecast future returns?
  • Given this idea, what data would I need to collect to validate that the model works?
  • If this data is available, collect it. If not, is there a very close substitute that we've already shown behaves similarly? If not, then the model is untestable and cannot currently be used.

Again, different models will be more or less sensitive to the use of, say, OHLCV vs. VWAP. I believe, but have not tested, that the two will generally behave fairly similarly, especially when zoomed out to 30-60 days, but that may not be good enough. If your holding period were, say, 1 hour, then you would almost definitely need minutely data to make effective forecasts. If your holding period is a few days, then you can look at the last 30-60 days to get a sense of what's going on in the world. Looking at the behavior of the last several hours may be helpful for some factors, and may not be helpful for others.

Is this helpful? If not, I apologize; it's Friday evening here and I'm still a bit jet-lagged, so my brain is fairly burnt out.

Thanks Delaney -

Hope you are relaxing now with your favorite beverage and can turn your brain off for awhile.

Your feedback is helpful. The 30-60 samples is kinda the minimum to be in the realm of large-sample statistics (versus ~6 samples for sketchy statistics). The other angle, which perhaps is consistent with your resolution-R, 1/R-sampling example, is to assume the phenomenon of interest is oscillating at a frequency of f = (1 day)^-1. The absolute minimum sampling rate then needs to be 2f, which daily OHLCV bars kinda-sorta satisfy. But to find f and its phase accurately, one needs a long look-back window if the sampling rate is limited to 2f. If f = (5 days)^-1, things are a bit better with daily OHLCV bars, but still sketchy due to noise and the limited sampling rate.

Maybe there is evidence to the contrary, but I'm wondering if limiting long-short factors to daily OHLCV bars will be a problem. Under normal market conditions, would one expect persistent relatively large effects on a 30-60 day time scale that would be profitable? I'd think that information would be absorbed more quickly, no?

One way to approach this would be to use the Q500US or Q1500US and compute the distribution of daily returns using daily OHLCV bars and daily VWAP, for comparison. The daily VWAP would be computed using minute bars (I think it is conventional to use V*(H+L+C)/3 for each minute bar). My hunch is that the distribution in daily returns will be considerably narrower when computed using daily VWAP (computed using up to 3*390 = 1170 price values per day). Additionally, there should be an improvement in accuracy, due to the volume weighting (assuming larger volume trades better represent the true market price).
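The daily-VWAP-from-minute-bars computation described above can be sketched as follows, with synthetic bars standing in for real minute data (the typical-price convention (H+L+C)/3 is as stated in the comment):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 390  # minute bars in a US trading day

# Hypothetical minute bars: close, high, low, volume.
close = 100 + np.cumsum(rng.normal(scale=0.05, size=n))
high = close + np.abs(rng.normal(scale=0.03, size=n))
low = close - np.abs(rng.normal(scale=0.03, size=n))
volume = rng.integers(1_000, 10_000, size=n).astype(float)

# Typical price per minute bar, then volume-weight across the day,
# so higher-volume minutes pull the daily price toward themselves.
typical = (high + low + close) / 3.0
daily_vwap = np.sum(volume * typical) / np.sum(volume)

print(f"daily VWAP: {daily_vwap:.2f}, last close: {close[-1]:.2f}")
```

Comparing the distribution of daily returns computed from `daily_vwap` versus from `close[-1]` across a universe like the Q500US would be one way to test the narrowing hypothesis above.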

I would say that it depends on what you're trying to model. If you're trying to determine what the average state on a day might be, then certainly average state on previous days may be informative. If you're trying to determine close price, then maybe close price may be informative.

Yes, but my understanding is that the Quantopian daily OHLCV data are derived from single trades, of unknown volume. One can imagine strategies that are close-to-open gap type trading, but even then, it would seem that averaging over N trades near the end of the day would be preferred (or using an EWMA).

My sense is that the run-of-the-mill long-short approach described in the blog post will be concerned with "the average state on a day" versus individual extreme values, i.e. daily OHLCV bars. Maybe Jonathan Larkin or other industry specialists can speak to this. Is there a conventional approach to the problem of what data to use to represent daily prices for computing returns? Naively, I'd think that one would want the highest signal-to-noise ratio for the daily price, and a value that best represents the price for the entire day.