Insider Trading Algo - Need help!

I could use a little help with this. For some reason it works when I backtest it, but not when I run it live: the algorithm runs, but it never places any trades, even when all the conditions are met.

Here's what it's supposed to do:

It's supposed to run once a day, near market close.
It pulls the day's insider trades from the URL in the code, which serves real-time insider trades in CSV format. (This data auto-updates.)

It will then place trades based on the data received via the URL if the following conditions are met...

  • Stock price must be between $1 and $5 (assigns 5 points)
  • The 'insider name' field must contain one of the following: "President", "CFO", "Director", "CEO", "COO", "VP", "Chief", "Executive" (assigns 5 points)
  • Total value of the insider trade must be greater than $5,000 (assigns 5 points)
  • The stock's 90-day performance must be greater than SPY's 90-day performance (assigns 1 point)
  • The stock's current price must be above its 365-day average (assigns 1 point)

For each condition that is met, the algorithm adds the listed points to a running score to decide whether to go ahead and purchase. If the total score is greater than 14, it's supposed to place a buy order for the stock.
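
As a sanity check, the scoring rules above can be sketched in plain Python (all names here are invented for illustration; this is not the original algo's code). Note that the maximum score is 5 + 5 + 5 + 1 + 1 = 17, so a "greater than 14" threshold means all three 5-point conditions must be met.

```python
# Hypothetical sketch of the scoring scheme described above.
def score_insider_trade(price, insider_title, trade_value,
                        stock_90d_return, spy_90d_return, avg_365d_price):
    """Return the total score for one insider-trade record."""
    EXEC_TITLES = ("President", "CFO", "Director", "CEO",
                   "COO", "VP", "Chief", "Executive")
    score = 0
    if 1.0 <= price <= 5.0:                           # price between $1 and $5
        score += 5
    if any(t in insider_title for t in EXEC_TITLES):  # senior insider
        score += 5
    if trade_value > 5000:                            # trade worth > $5,000
        score += 5
    if stock_90d_return > spy_90d_return:             # beats SPY over 90 days
        score += 1
    if price > avg_365d_price:                        # above its 365-day average
        score += 1
    return score

def should_buy(score):
    # Buy only when the score exceeds 14, i.e. all three
    # 5-point conditions must hold (5 + 5 + 5 = 15 > 14).
    return score > 14
```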

This algo can't be backtested meaningfully because the data URL is real-time. Even so, when I backtest it, it seems to work and knows when to make the buy. But since deploying it live, it has never made a buy, even when the conditions are met.

Anybody got any ideas on how to get this running?

8 responses

I think the problem is that your fetch_csv is only occurring at initialize.

Use schedule_function to indirectly call fetch_csv before you call calculate_scores.

I don't believe that your my_universe function is ever called. The universe_func property of fetch_csv is not documented. Instead, pass post_func=my_universe in the fetch_csv() call. Note that you will have to reduce my_universe to a single parameter for it to compile.
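
For reference, a post_func takes exactly one argument, the parsed DataFrame, and must return a DataFrame. A minimal sketch of that shape (the "value" column name is hypothetical, not from the original file):

```python
import pandas as pd

def my_universe(df):
    # post_func receives the fetched CSV already parsed into a DataFrame
    # and must return a DataFrame. Here we drop rows whose (hypothetical)
    # "value" column failed to parse as a number.
    df = df[pd.to_numeric(df["value"], errors="coerce").notnull()]
    return df

# On Quantopian this would be wired up in initialize(), roughly:
#   fetch_csv(data_link, date_column="date", post_func=my_universe)
```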

Whether you keep calling calculate_scores from schedule_function, or instead call it from the my_universe function (which runs after the data is obtained from the network), is up to you.

Lastly, there may be a caching issue (I am only speculating here). You may want to append a timestamp when you make the fetch_csv() call, such as data_link + "?ts=" + str(time.time()).
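
The cache-busting idea above can be sketched as a small helper; the throwaway query parameter makes intermediate caches treat each fetch as a new URL. (data_link is the poster's CSV URL, a placeholder here.)

```python
import time

def cache_busted(url):
    # Append a timestamp query parameter so each fetch looks like
    # a fresh URL to any caching proxy in between.
    sep = "&" if "?" in url else "?"
    return url + sep + "ts=" + str(int(time.time()))

# e.g. fetch_csv(cache_busted(data_link), ...)
```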

fetch is also called at the start of trading, so be sure your data is available at midnight.

Whimsbee,

If I move fetch_csv into calculate_scores, I get the following:

"FunctionCalledOutsideOfInitialize: 0027 'fetch_csv' only permitted within initialize function There was a runtime error on line 43."

Eeeek

Eeeek is right! Obviously I didn't read the docs well:

https://www.quantopian.com/help#overview-fetcher

When I was testing your code yesterday, it struck me as odd that you only called fetch_csv in initialize. According to the docs, the data is fetched around midnight (Eastern time) and must somehow be similar to the data that existed before. That seems odd to me; you should be able to grab the data at a time you specify, right?

I am pretty confused by what the doc means here: "It's important that the fetched data with dates in the past be maintained so that warm up can be performed properly; Quantopian does not keep a copy of your fetched data, and algorithm warmup will not work properly if past data is changed or removed. Data for 'today' and dates going forward can be added and updated."

Anyhow, sorry for the confusion. I will say again that the docs do not mention the universe_func parameter. That said, I do see it used in some examples from what appear to be Quantopian developers, though not in all examples. The reason I brought it up is that I put print()s in your my_universe function and never got any output. https://www.quantopian.com/posts/new-feature-fetcher

So: looking at the data, I do see an issue: there are line breaks within the csv records. It appears that some program is line-wrapping the data, and that this is causing the data to be misread. Add this code to the beginning of your clean_up_columns function and you will see that the line breaks in the CSV file are causing an issue:

    print fetcher_data.tail()  
    print "see the problem? some invalid data caused by those line breaks in the csv file..."  

First fix these line breaks, either in this function by modifying the dataframe or by fixing the actual csv file. Then I would suggest sorting the data ascending by date and time so that all the old data stays in the same order.
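
To illustrate the dataframe-side fix, here is a toy reproduction of the wrapping problem with a cleanup of the kind a post_func could do (the column names and sample rows are invented; the real file's columns will differ):

```python
import io
import pandas as pd

# Toy reproduction of the line-wrap problem: the third record has been
# split across two physical lines, so the parser sees one short row
# ("CFO Jane") and one junk row (" Roe,8000").
raw = (
    "date,symbol,insider,value\n"
    "2016-05-02,ABCD,CEO John Doe,12000\n"
    "2016-05-02,EFGH,CFO Jane\n"
    " Roe,8000\n"
)
df = pd.read_csv(io.StringIO(raw))

# Coerce the critical columns; rows damaged by the wrapping fail to
# parse and become NaN/NaT, so they can simply be dropped.
df["date"] = pd.to_datetime(df["date"], errors="coerce")
df["value"] = pd.to_numeric(df["value"], errors="coerce")
clean = df.dropna(subset=["date", "value"])
```

Dropping the damaged rows loses those records, of course; fixing the wrapping at the source (or rejoining the split lines before parsing) is the better long-term fix.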

This might be interesting, and you can find more in the forums by searching for fetch_csv. https://www.quantopian.com/posts/is-fetch-csv-still-not-allowed-in-live-trading

Lastly, leave the print in your clean_up_columns so that you can check the data while running your forward tests.

Good Luck.

Eeeek, as a side note, your fetch_csv URL seems to be unreachable now. While testing your algo yesterday, I found a couple of things:

1) The reason your algorithm does not place any orders during live trading is that the imported csv file does not contain data for the current trading day. fetch_csv works by using the current date as an index into the DataFrame constructed from your csv file (the index is specified by date_column) and making the data at that index available to your algorithm.

This behavior is similar in the backtester; the difference is that data is forward-filled for days (indexes) missing from the csv file, which is why the algo seems to work while backtesting. You could try writing a pre/post_func using pandas that forward-fills data in live trading as well. I have been working on this myself, but I haven't found a general, robust solution yet.
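
The forward-fill idea can be sketched with pandas (column names, dates, and values here are invented): reindex the fetched frame onto a continuous daily calendar, so a lookup for "today" finds the most recent prior row instead of nothing.

```python
import io
import pandas as pd

# A tiny stand-in for a fetched csv with gaps in the date index.
raw = (
    "date,symbol,score\n"
    "2016-05-02,ABCD,15\n"
    "2016-05-04,ABCD,16\n"
)
df = pd.read_csv(io.StringIO(raw), parse_dates=["date"]).set_index("date")

def ffill_to(df, end):
    # Build a daily index from the first fetched date through `end`,
    # then forward-fill so every day carries the last known row.
    full = pd.date_range(df.index.min(), end, freq="D")
    return df.reindex(full).ffill()

filled = ffill_to(df, "2016-05-06")
# 2016-05-03 now carries the 05-02 row; 05-05 and 05-06 carry 05-04's.
```

For a file covering multiple symbols you would apply this per symbol (e.g. after a groupby on the symbol column). Note this only mimics the backtester's forward-fill in your own code; it does not change when live fetch_csv pulls the file.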

2) The csv file returned by your url seems to change over time.

This relates to Whimsbee's quote from our live trading guidelines and limitations. Before a live trading session starts for the current day, your live algorithm needs to "know" its current state (portfolio state, context variables, etc.). To accomplish this, it is re-simulated from the day it started live trading up to the last trading day. For this reason, your algorithm must remain deterministic: it has to make exactly the same decisions it made when it originally traded live. Deterministic behavior will be compromised if the csv file does not keep the column values used on previous trading days, or if those values change.

With all that being said, we are planning to add an Insider Trading dataset to our Data Partners program in the near future. This would give you access to the dataset natively through our Pipeline API.

Regarding the universe_func parameter, support for it was dropped on the last update to the platform (04/20/2016). The old API required you to specify a trading universe, and this parameter allowed you to use a function to update your trading universe definition. This is no longer needed, with the new API your trading universe includes all available securities (~8000). You will see universe_func present mostly in older community posts.


Ernesto-

Thank you for jumping in here and providing some insight. I am new here, so please do tell me more. I would ask, though, that we talk about features that exist today rather than features that are likely to exist in the future :) - I am sure there are many of those, but I can use none of them :(

So the live trading session replays historic data every day to arrive at its state? Why not just keep the live trading session going forward and maintain the state, rather than regenerating it daily?

Can you please clarify exactly which fields are required in the file obtained via fetch_csv()? I see some instances where it is called with a symbol parameter, indicating that a symbol column is not required. Perhaps I am overlooking it, but I do not see where the suggested/required fields are defined in the docs.

Does the order of the historic data matter, given that you say it must remain in the csv for proper operation? Should new data always be appended at the end, or is there some index that allows it to be placed arbitrarily within the csv file? Does this logic account for the fact that a particular csv file might have thousands of entries for any given day?

If the data is collected at midnight, how is it possible to have data for the current trading day? Must I always forward-fill the csv file with future dates so that the data is relevant when it is collected? Or maybe I am missing the point, and the algorithm is meant to make decisions based on yesterday's data.

Last, is it possible to collect data from a URL in the middle of a trading day? Say I have some external program that can generate csv data covering the last 10 minutes; could an algorithm request this data every 10 minutes to affect its operation? I believe the answer is no, not at this time.

Again, thank you so much. And @peter, I hope you are able to get your problem solved.

-JJ

I haven't tested it, but you may be able to call initialize from other functions. It's really hackish, but it's something I had to do with Kivy, because Python's __init__ ran before the widget was populated; the widget didn't properly exist yet, so you couldn't touch it, and it needed a custom .init() method that depended on __init__ having already happened. So essentially you had to call .init before you could use anything that depended on dynamic parameters. I see three things:

One, the fetch API isn't allowed outside of initialize because otherwise you could fetch from long URL names every minute to exfiltrate data, which is not nice.
Two, you're fetching a dynamically generated file. For backtesting to work, you have to use an API that has date codes in the URL.
Example:
If you were trying to say "get the direction of price movement at a trading halt, as announced by a trading-halt tracker, then trade it again as soon as trading reopens", you would have to provide datetime data in the URL to backtest it. It seems like you know that, though.
Three, if your test fails when it runs once at end of day, you have two possible problems, imo: 1) the get-insider-trades code never runs when you need it, or 2) you use an order type that expires at end of day.

Just thoughts.

Also, I like what you're trying to do. I've looked at the same combination of forced trading halts and insider trading before.

For what it's worth, pandas-datareader has an edgar module.