New to Quantopian - help with using two data frames to determine buy / sell orders

I'm building my first algorithm with a very basic strategy.

I simply want to find all pharma companies with a market cap between $200M and $5B that have a clinical trial result due about 6 months after the current trading day, buy them, and then sell each position 1 month before the actual clinical trial result.

My current approach is to query the clinical trials database, combine the result with a query against the fundamentals database, and then run through the unified dataframe to see what I should buy and what I should sell / hold.

My question is: how do I combine my df result from the clinical trials database with the one from fundamentals, and then loop through it to buy / sell / hold?

I realize both have 'symbol' columns; I'm just not sure how to leverage them.

In case it helps, here is my code so far. Thank you for any input!

# for clinical trial data  
from quantopian.interactive.data.eventvestor import clinical_trials_free

# libraries for manipulating datasets  
from odo import odo  
import pandas as pd

# For building universe of stocks for algo  
from quantopian.algorithm import attach_pipeline, pipeline_output  
from quantopian.pipeline import Pipeline

# For getting fundamental data (primarily market cap data to filter companies)  
#from 

def initialize(context):  
    #set benchmark to XBI  
    set_benchmark(symbol('XBI'))  
    # Creates and attaches an empty pipeline  
    pipe = Pipeline()  
    pipe = attach_pipeline(pipe, name = "bio_pipeline")  
    # Find all sids in clinical dataset  
    # (element-wise conditions need & and parentheses; Python's `and` does not work here)  
    upcoming = clinical_trials_free[  
        (clinical_trials_free.clinical_phase != "Phase I")  
        & (clinical_trials_free.clinical_phase != "Pre-Clinical")  
        & (clinical_trials_free.clinical_phase != "Phase 0")  
    ][['symbol', 'sid', 'asof_date', 'clinical_phase', 'clinical_scope', 'clinical_result']]  
    upcoming_df = odo(upcoming, pd.DataFrame)  
    # Filter by market cap ($200M to $5B USD)  
    lessThan5B_df = get_fundamentals(  
        query(  
            fundamentals.valuation.market_cap,  
            fundamentals.share_class_reference.symbol  
        )  
        .filter(  
            # Python's `in` does not build a query condition; use .in_() with a list  
            fundamentals.share_class_reference.symbol.in_(upcoming_df['symbol'].tolist())  
        )  
        .filter(  
            fundamentals.valuation.market_cap <= 5000000000  
        )  
        .filter(  
            fundamentals.valuation.market_cap >= 200000000  
        )  
    )  
    # Filter by "if event in next 6 months"

def before_trading_start(context, data):  
    pass  
    # results = pipeline_output("bio_pipeline")  
    # update_universe(results)

# Will be called on every trade event for the securities you specify.  
def handle_data(context, data):  
    pass  
5 responses

bump?

Hi,
Sorry for the confusion here. The interactive namespace is designed exclusively for use in Quantopian Research, for the purpose of understanding the data better. As the error you get when you run this algo indicates, data sets in the interactive namespace aren't available in an algo at this point.

We are in the process of adding support for these data sets via the pipeline namespace. For data sets like those from Accern, PsychSignal, Estimize and more, we've fully added Pipeline API support. For EventVestor, we've added support for Buyback Authorization and Earnings Calendar. Unfortunately, we've not yet added Pipeline API support for the Clinical Trial data, so it cannot currently be accessed in an algo. Support for it is definitely in our backlog. It would likely work similarly to how the buyback auth and earnings calendar pipeline factors work today.
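
For exploring the idea in Research in the meantime, the join on 'symbol' that the original question asks about can be sketched in plain pandas. This is only a sketch under stated assumptions: the two frames below are made-up stand-ins (not real EventVestor or fundamentals data), and `label_actions`, the `result_date` column, and the 31 / 183 / 30 day thresholds are hypothetical names and values chosen for illustration. "Buy" is read here as "the result is roughly one to six months away", which is one possible interpretation of the strategy.

```python
import pandas as pd

# Hypothetical stand-in for the clinical trials frame (made-up rows).
trials = pd.DataFrame({
    "symbol": ["AAA", "BBB", "CCC", "DDD"],
    "result_date": pd.to_datetime(
        ["2016-09-15", "2016-04-20", "2016-12-01", "2016-08-01"]),
})

# Hypothetical stand-in for the fundamentals frame (made-up market caps, USD).
fundamentals_df = pd.DataFrame({
    "symbol": ["AAA", "BBB", "CCC", "EEE"],
    "market_cap": [1.2e9, 3.0e8, 9.0e9, 5.0e8],
})


def label_actions(trials, fundamentals_df, today, min_cap=200e6, max_cap=5e9):
    """Join the two frames on 'symbol' and label each row buy / sell / hold."""
    # Inner join keeps only symbols that appear in both frames.
    merged = trials.merge(fundamentals_df, on="symbol", how="inner")
    # Keep companies whose market cap lies in [min_cap, max_cap].
    merged = merged[merged["market_cap"].between(min_cap, max_cap)].copy()
    # Calendar days from `today` until the expected result.
    days_left = (merged["result_date"] - today).dt.days
    merged["action"] = "hold"
    # Assumed thresholds: ~6 months = 183 days, ~1 month = 30 days.
    merged.loc[days_left.between(31, 183), "action"] = "buy"
    merged.loc[days_left <= 30, "action"] = "sell"
    return merged


today = pd.Timestamp("2016-04-01")
actions = label_actions(trials, fundamentals_df, today)

# Loop through the unified frame, one row per symbol.
for row in actions.itertuples(index=False):
    print(row.symbol, row.action)
```

The inner join drops any symbol missing from either frame, so only companies with both a trial row and a market cap are considered. In an algorithm, the print loop would be replaced by order calls, once the data is reachable there.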

Sorry,
Josh

Hey Josh,

Thanks for the reply!

Do you have any ideas for workarounds? For example, can I filter the clinical data in Research and then export it as a .csv to be uploaded to an algorithm?

Unfortunately no. Part of our agreement with our data partners is that we will prevent the data from leaving the platform (it's how we can get you a low price for the monthly subscription). It's currently a matter of us building factors for these data sets one by one (for the data sets that don't work with our default integration).

+1 for the clinical trials Pipeline factor