The 7 Reasons Most Machine Learning Funds Fail by Marcos López de Prado

This talk, titled The 7 Reasons Most Machine Learning Funds Fail, looks at the particularly high rate of failure in financial machine learning. The few managers who succeed amass large amounts of assets and deliver consistently exceptional performance to their investors. However, that is a rare outcome. This presentation goes over the 7 critical mistakes underlying most financial machine learning failures, based on Marcos López de Prado’s experiences and observations.

The slides for this presentation can be found here.

Bio of the Speaker:
Dr. Marcos López de Prado is the chief executive officer at True Positive Technologies LP. He founded Guggenheim Partners’ Quantitative Investment Strategies (QIS) business, where he applied cutting-edge machine learning to the development of high-capacity strategies that delivered superior risk-adjusted returns. After managing up to $13 billion in assets, López de Prado acquired QIS and successfully spun out that business in 2018.

López de Prado is a research fellow at Lawrence Berkeley National Laboratory (U.S. Department of Energy, Office of Science). A top 10-most-read author in finance based on SSRN's rankings, he has published dozens of scientific articles on machine learning and supercomputing and holds multiple international patent applications on algorithmic trading.

Marcos earned a Ph.D. in Financial Economics (2003), a Ph.D. in Mathematical Finance (2011) from Universidad Complutense de Madrid, and is a recipient of Spain's National Award for Academic Excellence (1999). He completed his post-doctoral research at Harvard University and Cornell University.

Learn more by subscribing to our YouTube channel to access all of our videos.

As always, if there are any topics you would like us to focus on for future videos, please comment below or send us a quick note at [email protected].

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

18 responses

Thank you, what a great presentation! The link to the Google drive where the presentation slides are stored doesn't appear to be working however. I get the below error message when trying to access it:

Sorry, unable to open the file at this time.

Please check the address and try again.

Glad you enjoyed the presentation! I just updated the link in the post so it should work for you now.

Great presentation. Thanks

Thanks for posting this work by ML de Prado. I read his book, was interested in the idea of "fractional differentiation", played with it for a while, then got distracted by something else and forgot about it. Thanks for bringing this back to mind. See in particular Slide 15, Example 2:
" Most financial series can be made stationary with a fractional differentiation of order d< 0.5
However most financial studies are based on returns where d = 1"

Very often in my own trading system development work, instead of using some selected variable, i use what i call its "PPD" or Proportional Plus Derivative, which i define as Variable + dVariable/dt or, in discrete time intervals, as Variable[i] + (Variable[i] - Variable[i-1]), with or without some weighting applied. My rationale is that this "PPD" approach of incorporating some fraction of the derivative as well as the original signal gives us some degree of predictive power or "pseudo-look-ahead" for the next data value, albeit at the expense of adding a bit of noise, and sometimes it's worth it to do that.

I know that using 0.5*(Original Data + 1st Derivative) = 0.5*(Derivative of Order 0 + Derivative of Order 1) might not be exactly the same thing as de Prado's "Derivative of Order 0.5", but his comments on this do give me added confidence that i am on the track of something useful in trading system development.
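For readers who want to try the "PPD" idea described above, here is a minimal sketch of the discrete-time definition, Variable[i] + weight * (Variable[i] - Variable[i-1]). The function name `ppd`, the `weight` parameter, and the sample prices are illustrative assumptions, not something taken from the post or from de Prado's slides.

```python
def ppd(values, weight=1.0):
    """Proportional Plus Derivative of a series.

    Returns values[i] + weight * (values[i] - values[i-1]) for i >= 1.
    `weight` is a hypothetical knob for "some weighting applied";
    weight=1.0 reproduces the unweighted definition from the post.
    """
    out = []
    for i in range(1, len(values)):
        first_difference = values[i] - values[i - 1]
        out.append(values[i] + weight * first_difference)
    return out


# Made-up sample data, for illustration only.
prices = [100.0, 101.0, 103.0, 102.0]
print(ppd(prices))       # [102.0, 105.0, 101.0]
print(ppd(prices, 0.5))  # [101.5, 104.0, 101.5]
```

Note that the output is one element shorter than the input, because the first point has no predecessor to difference against.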

Has anyone else had some success using this or a similarly simplified approach in your trading systems / algos?

Thanks Tony for posting how a non-integer derivative can be calculated. I saw the video and understood the concept, but Google searching it did not provide me with an answer. I wonder if non-integer diff can be used in ARIMA analysis, like an ARIMA(0.5, 0, 0.5). But aside from that, no, I have no experience with it.

/L

Hi Luc, what i'm doing with this (what i call) "PPD" concept really comes from control systems theory that i studied about 30 years ago, and i'm not sure if it is actually the correct way to approach fractional derivatives, but i think it is probably at least a reasonable approximation to it. I can't make any comments about its application in the context of ARIMA analysis. Please let me know if you find out any more about it. Cheers, best wishes, Tony.

Tony, you might be interested in the following:

https://youtu.be/Nu4lHaSh7D4

As for the "FRACTIONAL DIFFERENCING" there is a nice new book from TC Mills "Applied Time Series Analysis":
[screenshot from a relevant page]
I have no experience with that yet, but will evaluate it in the coming weeks.
Thanks also for publishing the link to the slides. It also brought Prado's work back into my focus after some time. :)

Hi Hannes, thanks for this. I have a number of books on time series analysis but i haven't seen this one yet as it is still quite new (Feb 2019). I look forward to reading your evaluation of it, and in particular its potential relevance to trading. For anyone else following up on this topic, it is worth taking a look at what else Amazon has to offer under the same keyword search "applied time series analysis". Best wishes :)

While I haven't read his book, I believe what ML de Prado is referring to in "fractional differentiation" is probably in the context of ARFIMA:

Autoregressive fractionally integrated moving average
From Wikipedia, the free encyclopedia
In statistics, autoregressive fractionally integrated moving average models are time series models that generalize ARIMA (autoregressive integrated moving average) models by allowing non-integer values of the differencing parameter. These models are useful in modeling time series with long memory—that is, in which deviations from the long-run mean decay more slowly than an exponential decay. The acronyms "ARFIMA" or "FARIMA" are often used, although it is also conventional to simply extend the "ARIMA(p,d,q)" notation for models, by simply allowing the order of differencing, d, to take fractional values.

"Fractional differencing" by J. R. M. HOSKING is one of the first published papers on the subject.

Thanks Chak.
Hosking's original publication was in "Biometrika" and it is interesting that the literature in computational biology contains a wealth of excellent ideas that are potentially applicable to trading.

What i am trying to find now are some simple, easy-to-implement mathematical definitions for fractional difference operators that can be computed simply in terms of lagged data points at [i], [i-1], [i-2] etc and do not involve unnecessary complexities of Gamma functions, Hermite polynomials, etc. Can anyone who has done this in the context of practical trading and found it to be useful please assist further?

@Hannes, your "screenshot from the relevant page" looks interesting. Please, can you provide a little more? .... in particular the lines that follow regarding the Binomial expansion of the difference operator DELTA^d for any real d > -1.

Hannes, thanks for this. Best wishes, Tony.

Tony,
Here is an implementation of the fractional derivative following de Prado's book (Chapter 5):
Philippe Rémy Frac Deriv, from dePrado

His fixed window approx says that if the coeff are close to zero, you might as well truncate them, and then you get some finite dot-product of the time series data with the coeff seq constructed from d....sorta like what you are doing actually...
alan
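To make the fixed-window idea above concrete, here is a minimal pure-Python sketch. It builds the coefficients from the binomial expansion of (1 - B)^d using only a simple recurrence (no Gamma functions), truncates them once they fall below a threshold, and takes a dot product with lagged data points at [i], [i-1], [i-2], and so on. The function names, the `threshold` and `max_lags` defaults, and the sample series are assumptions made for illustration; this is not the linked implementation.

```python
def fracdiff_weights(d, threshold=1e-4, max_lags=1000):
    """Coefficients of (1 - B)^d, truncated when |w_k| < threshold.

    Uses the recurrence w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k.
    `threshold` and `max_lags` are illustrative defaults.
    """
    weights = [1.0]
    k = 1
    while k < max_lags:
        w = -weights[-1] * (d - k + 1) / k
        if abs(w) < threshold:
            break  # remaining coefficients are treated as zero
        weights.append(w)
        k += 1
    return weights


def fracdiff(series, d, threshold=1e-4):
    """Fixed-window fractional difference: dot product of the
    truncated weights with the current and lagged data points."""
    w = fracdiff_weights(d, threshold)
    width = len(w)
    out = []
    for i in range(width - 1, len(series)):
        out.append(sum(w[k] * series[i - k] for k in range(width)))
    return out


# d = 1 collapses to the ordinary first difference.
print(fracdiff_weights(1.0))              # [1.0, -1.0]
print(fracdiff([1.0, 2.0, 4.0, 7.0], 1.0))  # [1.0, 2.0, 3.0]

# d = 0.5 keeps a slowly decaying tail of lag weights.
print(fracdiff_weights(0.5, threshold=0.05))  # [1.0, -0.5, -0.125, -0.0625]
```

With d = 0.5 the weights are 1, -0.5, -0.125, -0.0625, ..., which differs from the 0.5*(value + first difference) combination discussed earlier in the thread: the fractional operator spreads small negative weights over many lags rather than stopping at lag 1.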

Good afternoon!
The result is appreciated everywhere in the world.
You are a professional with great experience. I ask for your advice. Please, tell me. How and where do I find an investor? I have a higher technical education. I am an engineer. I have been watching the markets since 1989. This is my hobby.
I developed my individual approach to financial markets. It was very hard. I am ready to prove my words.
My system generates a lot of profitable signals in different markets. I do not have enough capital. I am ready to sell some signals that I can not trade because of the lack of the required amount of capital. It can be intraday, 1 day, 1 week, 1 month, 3 months, 6 months, 1 year. I find the entry point to the market with minimal risk and good potential for profit. The risk / reward ratio is from 1 to 2 or more. Your opinion will be very important to me. I thank you in advance. Sincerely, Trader Analyst.

P.S.

I live in Ukraine. In Ukraine financial markets are underdeveloped. I ask your advice.

The result is appreciated everywhere in the world.

I wanted to post a forecast for MMM stock on Seeking Alpha.
This was on May 1, 2019 before the market opened.

Here is this forecast.

I am considering selling 3M Company (ticker symbol: MMM). Overnight position market risk hedged. Stocks from one sector.
Current Price: 189.45
Possible movement to the level: 186.5; 183.0; 180.0
I will report more information about this ticker.

This forecast has not been posted.

Seeking Alpha editors answered:

Thank you for your recent submission to Seeking Alpha. Your article "I Am Considering Sell 3M Company (Ticker Symbol: MMM). Overnight Position Market Risk Hedged. Stocks From One Sector." has been declined by our editorial team. Our editors had the following feedback:
Thank you for the submission, but we're going to pass. While we appreciate you translating this into English, the article is much too brief for publication. Keep in mind that we're looking for authors to provide actionable advice based on detailed, in-depth fundamental analysis: a deep dive into company financials, analysis of the competitive scene, risks/challenges to the thesis, etc., to present a unique investment perspective that our readers can’t easily find elsewhere. As such, a few sentences saying you're going to sell MMM doesn't work for publication.

The result is appreciated everywhere in the world.
I propose to see the result of this position.

3M Company (ticker symbol: MMM). SELL.

I propose to calculate the potential profit. Volume position 1000 shares.

$ 189.45 x 1000 shares = $ 189,450. This is the invested capital.

On my system at the price of $ 186.5, 50% of the position was closed.
$ 189.45 x 1000 = $ 189,450
$ 186.50 x 1000 = $ 186,500
$ 189,450 - $ 186,500 = $ 2,950. This is 1.5% of the invested capital.

We continue to hold the position.

$ 186.50 x 500 = $ 93,250

In my system, at a price of 183.5, 50% of the position was closed.

$ 186.50 x 500 = $ 93,250
$ 183.50 x 500 = $ 91,750
$ 93,250 - $ 91,750 = $ 1,500. This is 1.6% of the invested capital.

We continue to hold the position.

$ 189.45 x 250 = $ 47,362.5

On May 6, 2019, the stock fell to $ 180.13.
The market closed at $ 183.04.
In my system, the position should be closed at $ 180.50

$ 189.45 x 250 = $ 47,362.5
$ 180.50 x 250 = $ 45,125.
$ 47,362 - $ 45,125 = $ 2,237. This is 4.7% of the invested capital.

Profit for four trading days.
$ 2,950 + $ 1,500 + $ 2,237 = $ 6,687. 1.5% + 1.6% + 4.7% = 7.8%
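The figures above mix share counts and reference prices between legs, so as a cross-check here is a minimal sketch of one consistent way to account for the trade. It assumes a short of 1000 shares entered at $189.45 and covered in three tranches of 500, 250 and 250 shares at the exit prices quoted in the post, with every leg measured against the entry price and percentages taken on the full initial position. The tranche sizes are an interpretation of "50% of the position was closed", and the function name is hypothetical.

```python
def short_leg_profits(entry_price, exits):
    """Profit of each covered tranche of a short position.

    `exits` is a list of (exit_price, shares) pairs. Each leg earns
    (entry_price - exit_price) * shares, rounded to cents.
    """
    return [round((entry_price - price) * shares, 2) for price, shares in exits]


entry = 189.45
# Assumed tranches: half, then half of the remainder, then the rest.
exits = [(186.50, 500), (183.50, 250), (180.50, 250)]

legs = short_leg_profits(entry, exits)
total = round(sum(legs), 2)
capital = entry * sum(shares for _, shares in exits)
pct = round(100 * total / capital, 2)

print(legs)   # [1475.0, 1487.5, 2237.5]
print(total)  # 5200.0
print(pct)    # 2.74
```

Under these assumptions the total comes to $5,200, or about 2.74% of the $189,450 notional, rather than the $6,687 and 7.8% obtained by adding percentages computed on different bases.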

By the way. The position volume could be increased 10 times. This is for this ticker.

I do not have enough capital. I am ready to sell such ideas on mutually beneficial terms.

They also fail because machine learning and AI are over-hyped right now. This often causes people who are not experts in this area to rush (as in a gold rush) into areas related to AI. Unfortunately for those people, they have to deliver results (unlike some startups, which can assemble a team, build a semblance of a product, and still get an exit). On top of that, applying data analysis in finance is quite different from applying it elsewhere and requires expertise in both data modeling and the financial domain: psychology, bubbles, crashes, technical signals, taxes, risk, market regimes, etc. Today it is not even clear whether finance requires sophisticated modeling of the kind used for image recognition or speech recognition with neural networks. Most likely nobody has enough data to support that. Moreover, it looks like simple hypothesis tests (tracking and comparing distributions) and regressions might be enough, even though those over-fit very quickly. Changes in market conditions may cause an unprotected AI fund to drop by hundreds of percent if it had picked up on some short-term anomaly and assumed it to be a rule. The anomaly could have been people flocking to a highly risky financial product, eventually causing the bubble to burst.

Supervised ML for: parameter tuning, universe selection, portfolio implementation (adaptive executions)