The Social Media Trader Mood Series Pt. 2: Research Design

Research design is a fundamental and often overlooked part of the algorithm creation process. It is the point where the grounded quant asks, "What is my universe?" and "What is my training/testing dataset?" These two simple questions provide a solid framework for framing and validating results from factor research and backtesting.

Here's an overview of what you'll learn from this notebook:

  • How to break down a list of securities into liquidity baskets
  • How to set guidelines for your universe of securities based on capital base and data coverage
  • How to validate your universe constraints with your in-sample datasets
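
To make the first bullet concrete, here is a minimal sketch of one way to sort securities into liquidity baskets. It assumes liquidity is measured by average daily dollar volume and that baskets are equal-sized groups; the function name `liquidity_baskets`, the tickers, and the numbers are all made up for illustration and are not taken from the notebook.

```python
from statistics import mean


def liquidity_baskets(dollar_volume, n_baskets=3):
    """Assign each ticker to a liquidity basket.

    dollar_volume: dict mapping ticker -> list of daily dollar volumes.
    Returns a dict mapping ticker -> basket index, where 0 is the least
    liquid basket and n_baskets - 1 is the most liquid.
    """
    # Rank tickers from least to most liquid by average daily dollar volume.
    ranked = sorted(dollar_volume, key=lambda t: mean(dollar_volume[t]))
    n = len(ranked)
    # Cut the ranking into n_baskets groups of (nearly) equal size.
    return {ticker: rank * n_baskets // n for rank, ticker in enumerate(ranked)}


# Hypothetical tickers and volumes, for illustration only.
sample = {
    "AAA": [5e6, 7e6, 6e6],
    "BBB": [2e8, 3e8, 2.5e8],
    "CCC": [4e5, 6e5, 5e5],
    "DDD": [9e7, 1e8, 1.1e8],
    "EEE": [1e9, 1.2e9, 0.9e9],
    "FFF": [3e6, 2e6, 4e6],
}
baskets = liquidity_baskets(sample)
```

With baskets in hand, a universe guideline can be as simple as "only trade baskets 1 and 2", and any factor result can be checked basket by basket rather than on the pooled universe.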

I highly recommend following this series in chronological order. The first part covers the first and most important step of strategy creation: data examination. You can find links to each section below as they become available.

In terms of pacing, this post is step 2, Research Design:

  1. Introduction - Examining the data. My goal here is to simply look at the dataset and understand what it looks like. I’ll be answering simple questions like, “How many stocks are covered?”; “Which sectors have the most coverage?”; and “What’s the distribution of sentiment scores?”. These are very basic but fundamentally important questions that lay the groundwork for all further development.
  2. Research Design - Here, I’ll be setting up my environment for hypothesis testing and defining my in-sample and out-of-sample datasets, both cross-sectionally and through liquidity thresholds.
  3. Hypothesis Testing - This is where I’ll be setting up a number of different hypotheses for my data and testing them through event studies and cross-sectional studies. The Factor Tearsheet and Event Study notebooks will be used heavily. The goal is to develop an alpha factor to use for strategy creation.
  4. Strategy Creation - After I’ve developed a hypothesis and seen that it holds up consistently over different liquidity and sector partitions in my in-sample dataset, I’ll finally begin the process of developing my trading strategy. I’ll be asking questions like “Is my factor strong enough by itself?” and “What is its correlation with other factors?” Once these questions have been answered, the trading strategy will be constructed and I’ll move on to the next section.
  5. Out-Of-Sample Test - Here, my main goal is to verify the work of steps 1~4 with my out-of-sample dataset. It will involve repeating many of the steps in 2~4 as well as the use of the backtester (notice how only step 5 involves the backtester).
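
Step 2 mentions splitting the data cross-sectionally into in-sample and out-of-sample sets. The list above does not say how that split is made, so the following is only one possible sketch: a random holdout of tickers with a fixed seed, so the same securities stay out-of-sample across every step. The function name `cross_sectional_split`, the 50/50 fraction, and the tickers are assumptions for illustration.

```python
import random


def cross_sectional_split(tickers, in_sample_fraction=0.5, seed=0):
    """Randomly assign tickers to an in-sample and an out-of-sample group."""
    # Sort first so the result does not depend on the input order.
    shuffled = sorted(tickers)
    # A dedicated, seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_in = int(len(shuffled) * in_sample_fraction)
    return set(shuffled[:n_in]), set(shuffled[n_in:])


# Hypothetical tickers, for illustration only.
tickers = ["AAA", "BBB", "CCC", "DDD", "EEE", "FFF", "GGG", "HHH"]
in_sample, out_of_sample = cross_sectional_split(tickers)
```

Fixing the seed matters here: if the held-out names changed between the hypothesis-testing and strategy-creation steps, out-of-sample securities would leak into in-sample work.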

As this is my first time working through this flow, the steps above are subject to change as I learn and iterate through my mistakes. Feel free to post feedback and questions.

Quick Notes:

Pipeline: from quantopian.pipeline.data.psychsignal import aggregated_twitter_withretweets_stocktwits_free  
Research: from quantopian.interactive.data.psychsignal import aggregated_twitter_withretweets_stocktwits_free
