Feature scaling for multiple assets with sklearn/random forest

Hoping for some advice on training a single Random Forest classifier on 200+ assets to predict the direction of the next bar/candle. This is a learning experiment where accuracy is not the primary goal; I'm more interested in how to tackle the data processing for building a single generalised model across multiple assets.

I have a dozen or so features derived from price (simple technical indicators and statistical calculations); the label/prediction is the direction of the next candle/bar.

  1. What is the preferred way to scale the data (standardize/normalize/log, etc.) across all the asset features? Price, volatility, and returns are wildly different between assets, e.g. Asset 1 trades at $0.30 with a ±10% stddev of returns while Asset 2 trades at $2,500 with a ±2% stddev of returns.
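For reference, here's a minimal sketch of the kind of per-asset feature preparation I mean (the column names and rolling windows are placeholders, not my actual setup):

    import pandas as pd

    def make_features(prices: pd.Series) -> pd.DataFrame:
        # prices: close prices for one asset, indexed by bar timestamp
        returns = prices.pct_change()
        feats = pd.DataFrame({
            "ret_1": returns,                     # scale-free 1-bar return
            "ret_5": prices.pct_change(5),        # 5-bar return
            "vol_20": returns.rolling(20).std(),  # rolling volatility
            "zscore_20": (prices - prices.rolling(20).mean())
                         / prices.rolling(20).std(),  # price z-score
        })
        # label: direction of the NEXT bar (1 = up, 0 = down/flat)
        feats["label"] = (returns.shift(-1) > 0).astype(int)
        return feats.dropna()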

Many thanks,
J

4 responses

You can use sklearn's StandardScaler, but you shouldn't need to: a Random Forest splits your data to maximize information gain, and each split compares a single feature against a threshold, so monotonic rescaling of a feature doesn't change the splits.

https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html
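If you do want to scale anyway, here's a minimal sketch of StandardScaler in a Pipeline in front of the forest (toy data stands in for your features; the hyperparameters are arbitrary):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # toy stand-in for your (features, next-bar-direction) data
    X, y = make_classification(n_samples=1000, n_features=12, random_state=0)
    # no shuffling, mirroring a chronological train/test split
    X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=False)

    model = make_pipeline(
        StandardScaler(),  # harmless for trees, just usually unnecessary
        RandomForestClassifier(n_estimators=200, random_state=0),
    )
    model.fit(X_train, y_train)
    print(model.score(X_test, y_test))

One side benefit of the Pipeline: the scaler's mean/std are fit on the training split only, so no test-set statistics leak into training.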

Thanks Robin. If I didn't standardize as suggested and left price (or returns) as absolute values for all 200 different asset classes, wouldn't that create a large number of unnecessary splits within the tree? On a single asset class it would be fine not to standardize with RF; I'm just not sure about 200.
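For example, I was considering z-scoring each feature within each asset before pooling everything into one training set, something like this (just a sketch, assuming a long-format DataFrame with an "asset" column):

    import pandas as pd

    def standardize_per_asset(df: pd.DataFrame, feature_cols: list) -> pd.DataFrame:
        # df: one row per (asset, bar), with an "asset" column plus features.
        # Z-score each feature within its own asset so all assets share a scale.
        # NB: full-sample mean/std leak future information; in a live setting
        # you'd compute these on a rolling or expanding window instead.
        out = df.copy()
        grouped = out.groupby("asset")[feature_cols]
        out[feature_cols] = (
            (out[feature_cols] - grouped.transform("mean"))
            / grouped.transform("std")
        )
        return out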

I'm not sure what you mean by unnecessary splits - can you share the rest of your code?

@Robin, can you share some code showing how you use a random forest?