How to order a stock using the Yahoo price

posted Mar 17, 2014

I prefer to use zipline for development (debugging is much easier) using Yahoo data. I want to use Yahoo data, using fetcher, when I port the strategy to Quantopian, so that I can make sure it works the way it did with zipline. Problem is that I need to 'order' a stock using the stock's sid, so the order is placed at the sid's (Quantopian) price, not the Yahoo price.

I put together this small demo to explain what I mean.

My question is: how can I order using the Yahoo price? (I could 'massage' the data object, but that's kinda messy)

10 responses

Jessica Stauth

Mar 17, 2014

Hi Dave,

Quantopian's pricing data is trade data: we show the last traded price (for each minute or each day, depending on the backtest mode) for a stock, regardless of which exchange the trade was made on. Yahoo, by contrast, provides an 'end-of-day' (EOD) data source. Yahoo and other EOD data providers get their price and volume data from the official exchange record. EOD sources rarely match data derived from intraday trades exactly. For instance, the official close for a NYSE stock is the last trade of the day for the stock on NYSE. But if the stock also trades on Chicago, Pacific or another regional exchange, the last trade on one of those exchanges could be our close.
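To make the distinction concrete, here is a minimal sketch of how the two "closes" can differ. The trades, exchange codes, and prices are invented for illustration, and the two helper functions are hypothetical; this is not how either Quantopian or Yahoo actually computes its data.

```python
from datetime import datetime

# Hypothetical trades for one stock on one day: (timestamp, exchange, price).
trades = [
    (datetime(2014, 3, 17, 15, 59, 57), "NYSE", 50.10),
    (datetime(2014, 3, 17, 15, 59, 58), "NYSE", 50.12),  # last trade on the primary exchange
    (datetime(2014, 3, 17, 15, 59, 59), "CHX", 50.05),   # later trade on a regional exchange
]

def official_close(trades, primary_exchange):
    """EOD-style close: the last trade of the day on the primary exchange."""
    on_primary = [t for t in trades if t[1] == primary_exchange]
    return max(on_primary, key=lambda t: t[0])[2]

def consolidated_close(trades):
    """Trade-data-style close: the last trade of the day on any exchange."""
    return max(trades, key=lambda t: t[0])[2]

eod_close = official_close(trades, "NYSE")
last_trade_close = consolidated_close(trades)
print(eod_close, last_trade_close)
```

In this made-up example the two closes differ by a few cents, which is the kind of small gap described above.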

I'm not aware of a way to simulate fills at the official end-of-day close price inside Quantopian currently (if we add an EOD source down the line, you'd be able to do that). More importantly, though, if you plan to trade your strategy using Quantopian's Interactive Brokers integration, Quantopian's trade-level, intraday-derived data is actually going to be a better approximation of the execution you will get. For most stocks of reasonable liquidity, the last traded price across all exchanges should be quite close to the official EOD number you'd see on Yahoo, and if your strategy hinges on the differential there, then it is probably something that won't hold up in live trading, at least within the constraints of the Quantopian platform.

Let me know if you have other questions, or if this doesn't answer your question completely.
Best regards, Jess

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Dan Dunn

Mar 17, 2014

The Quantopian backtester can only buy/sell sids in our database using our prices. We don't have the ability at this point to run backtests with other price series.

You can, of course, import other prices and data to use as a signal, but you can't buy/sell them.

I'd love to build the "buy/sell anything" feature, which would enable things like running backtests on bitcoin data, or data from an online gaming website, or anything like that. But for now we're more focused on real-money live trading.

Dave Gilbert

Mar 17, 2014

My problem is that I have developed a strategy using zipline (which can't access Quantopian data) and on porting to Quantopian, I get markedly different results. Hence the need to use the same data (even if it's not the best data) in both cases to make sure it's not a bug in the strategy. Bit of a bummer that it's not possible, but I understand your priorities.

If orders can only be placed using your prices, won't there possibly be differences with IB's prices as well? I've just started experimenting with my IB paper account - hence the question.

Dan Dunn

Mar 17, 2014

I think the problem actually disappears when you get to IB trading because there is only one price - the price that your order was filled at by IB. There's no second-guessing that one! Quantopian uses IB's order fill data.

Going back to your original problem, I'm interested in helping to track that down. In general, Yahoo's prices and our prices are so similar that the difference shouldn't be significant. The fact that Yahoo limits you to daily data might be the problem, depending on the specific strategy.
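One way to test the claim that two price sources are "so similar" is to line them up date by date and flag the days where they disagree by more than a small tolerance. The sketch below assumes each source has already been reduced to a plain dict of date string to closing price. The function name, the 0.5% tolerance, and the sample prices are all hypothetical.

```python
def flag_divergent_days(series_a, series_b, tolerance=0.005):
    """Return (date, a, b, rel_diff) for dates present in both series
    where the relative difference exceeds `tolerance`."""
    flagged = []
    for date in sorted(set(series_a) & set(series_b)):
        a, b = series_a[date], series_b[date]
        rel_diff = abs(a - b) / b
        if rel_diff > tolerance:
            flagged.append((date, a, b, rel_diff))
    return flagged

# Made-up closing prices from two sources for the same symbol.
yahoo_close = {"2014-03-12": 100.00, "2014-03-13": 101.00, "2014-03-14": 99.00}
other_close = {"2014-03-12": 100.02, "2014-03-13": 101.00, "2014-03-14": 100.50}

flagged = flag_divergent_days(yahoo_close, other_close)
for date, a, b, rel_diff in flagged:
    print(date, a, b, round(rel_diff, 4))
```

If a check like this flags only a handful of days, the gap between the two backtests is more likely to come from the strategy logic or the data frequency than from the price source.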

When I started digging into your algorithm, it looks like you're not pulling in all the data that you think you are. Check out the results of the record() function - it's not the full data set.

Dave Gilbert

Mar 17, 2014

The example I sent wasn't the strategy I was referring to - just a demo to show that the order used Quantopian data and not the fetched Yahoo data. I've attached the strategy FYI, together with the zipline code. The Quantopian return for the period 1 Jan 2007 - 1 Mar 2014 was ~150% vs ~250% with zipline and Yahoo data.

Here is the zipline code as well, in case you want to check it out:

# inspired by Cliff Smith  
# http://seekingalpha.com/article/1928941-new-tactical-asset-allocation-strategy-to-grow-retirement-savings-at-reduced-risk

from zipline.utils import tradingcalendar  
from zipline.algorithm import TradingAlgorithm  
from zipline.finance.commission import PerShare, PerTrade  
from zipline.finance import slippage  
from zipline.transforms import MovingAverage  
from zipline.utils.factory import load_bars_from_yahoo  
from zipline.transforms import batch_transform  
import logbook  
log = logbook.Logger('Transform', level=1)

import pytz  
import datetime as dt  
from math import sqrt  
import pandas as pd  
from numpy import inf  # needed by the .replace(inf, 0) call in handle_data

window = 89

def endpoints(start, end, period='m') :  
    dates = tradingcalendar.get_trading_days(start, end)

    if isinstance(period, int) :  
        dates = [dates[i] for i in range(0, len(dates), period)]  
    else :  
        if period == 'm' : months = 1  
        elif period == 'q' : months = 3  
        elif period == 'b' : months = 6  
        elif period == 'y' : months = 12  
        monthend_dates = [dates[i - 1] for i in range(1,len(dates))\  
                          if dates[i].month > dates[i-1].month\  
                          or dates[i].year > dates[i-1].year ]+ list([dates[-1]])  
        dates = [monthend_dates[i] for i in range(0,len(monthend_dates),months)]  
    return dates

@batch_transform  
def get_past_prices(data):  
    prices = data['price']  
    return prices

class AllAssetsExceptBonds(TradingAlgorithm):  
    def initialize(self):  
        log.info('Initializing.......')  
        self.get_past_prices = get_past_prices(window_length=window)  
        self.add_transform(MovingAverage, 'mavg', ['price'],  
                           market_aware=True,  
                           window_length=window)  
    #    STOCKS = [ 'SHY', 'CSD', 'IJR', 'MDY', 'PBE', 'GURU', 'EFA', 'EPP', 'EWA', 'IEV', \  
    #              'ADRE', 'DEM', 'RWO', 'RWX', 'GLD']  
        self.stocks = STOCKS

        self.top_n = 2

        # Use the fixed slippage model, which will just add/subtract a specified spread  
        # spread/2 will be added on buys and subtracted on sells per share  
        # commission will be charged per trade  
        self.slippage = slippage.FixedSlippage(spread=0.0)  
        self.commission = PerTrade(10.0)

        # want to rebalance on the last trading day of the month  
        # because of the event-driven nature of zipline, this means  
        # that we need to apply the ranking formula a day earlier  
        start = dt.datetime(2007,1,1, 0, 0, 0, 0, pytz.utc)  
        end = dt.datetime(2014,3,1, 0, 0, 0, 0, pytz.utc)  
        self.event_dates =  endpoints(start, end, period='q' )  
#        log.info('Finished Initializing.......')  
    def handle_data(self, data):

        hist = self.get_past_prices.handle_data(data)

        #circuit breaker in case transform returns none  
        if hist is None:  
            return  
#        log.info('*********************Handle Data')  
        #circuit breaker, only calculate on 2nd last trading day of month  
        if self.get_datetime() in self.event_dates:  
            log.info('\n\nDATE\n\n %s' % self.get_datetime())  
            # for debugging  
            for security in self.stocks:  
                if self.portfolio.positions[security].amount > 0:  
                    log.debug ('{} {}'.format([self.portfolio.positions[security][key]\  
                                for key in [ 'sid', 'amount', 'cost_basis']],\  
                                self.portfolio.portfolio_value))  


            # calculate strategy data  
            d_returns = hist / hist.shift(1) - 1  
            perf_88 = (hist.ix[-1] / hist.ix[0] - 1).fillna(0).replace(inf, 0)  
            log.info('\n\nPRICE\n\n %s' % hist[:5])  
            log.info('\n\nPERF_88\n\n %s' % perf_88)  
            vol_88 = d_returns.std() * sqrt(252)  
            log.info('\n\nVOL_88\n\n %s' % vol_88)  


            rs = (perf_88.rank() * 0.65 + vol_88.rank(ascending=False) * 0.35).fillna(0)  
#            log.info('\n\nPERF_88_MAX\n\n %s' % perf_88.max())  
            rs_modified = rs * 10 + perf_88.max()- perf_88 / perf_88.max()  
            ranks = rs_modified.rank(ascending=False)  
#            log.info('\n\nRS\n\n %s' % rs)  
#            log.info('\n\nRS_MODIFIED\n\n %s' % rs_modified)  
#            log.info('\n\nRANKS\n\n %s' % ranks)

            weights = pd.Series(0., index=ranks.index)  
            for security in ranks.index:  
                if ranks[security] <= self.top_n :  
#                    log.info('\n\nTEST SECURITY %s' % security)  
#                    log.info('\n\nPRICE %s' % data[security]['price'])  
#                    log.info('\n\nMAVG %s' % data[security].mavg['price'])  
                    if ranks[security] < ranks['SHY'] and data[security].price >= data[security].mavg['price'] :  
#                        log.info('\n\nADD SECURITY %s' % security)  
                        weights[security] = weights[security] + 1. / self.top_n  
                    else:  
#                        log.info('\n\nREPLACE %s' % security)  
                        weights['SHY'] = weights['SHY'] + 1. / self.top_n # replace security with cash

#            log.info('\n\nWEIGHTS\n\n %s' % weights)  
            for security in weights.index:  
                if weights[security] == 0. and self.portfolio.positions[security].amount > 0:  
                    log.info('\n\nLIQUIDATE SECURITY %s' % security)  
                    log.info('\n\nCURRENT POSITION\n\n %s' % self.portfolio.positions[security].amount)  
                    self.order_target_percent(security,  0) # close position  
                elif weights[security] > 0:  
                    log.info('\n\nREBALANCE SECURITY %s' % security)  
                    log.info('\n\nSECURITY WEIGHT%s' % weights[security])  
                    self.order_target_percent(security, weights[security])  
            #this is just for debugging purposes  
            for security in weights.index:  
                if weights[security] > 0:  
                    log.debug('{} {}'.format(security, weights[security]))

start = dt.datetime(2007, 1, 1, 0, 0, 0, 0, pytz.utc)  
end = dt.datetime(2014, 3, 1, 0, 0, 0, 0, pytz.utc)

datapath = 'E:\\Temp\\data_all.pkl'

STOCKS = [ 'SHY', 'CSD', 'IJR', 'MDY', 'PBE', 'GURU', 'EFA', 'EPP', 'EWA', 'IEV', \  
              'ADRE', 'DEM', 'RWO', 'RWX', 'GLD']

Algo = AllAssetsExceptBonds()

#this uses zipline to load data  
try :  
    data = pd.read_pickle(datapath)  
except :  
    data = load_bars_from_yahoo(indexes={}, stocks=STOCKS, start=start, end=end)  
    data.to_pickle(datapath)

# this uses finlib to load data  
#data = get_history(STOCKS, start, end,'G:\\Google Drive\\Python Projects\\PyScripter Projects\\Computational Investing\\Data\\')  
#data.major_axis = data.major_axis.tz_localize(pytz.utc)  
#data.minor_axis = np.array(['open', 'high', 'low', 'close', 'volume', 'price'], dtype=object)

# check for data problems  
for key in data.keys() :  
    if data[key].first_valid_index() != data[key].index.min() :  
        print 'WARNING: ',  key, 'only has valid data from ', data[key].first_valid_index()

for symbol in STOCKS :  
    bad_idx = pd.isnull(data[symbol][data[symbol].index >= data[symbol].first_valid_index()]).any(1).nonzero()[0]  
    print symbol, ': bad data len =', len(bad_idx), '===>', bad_idx

# in most cases, just forward fill  
data = data.fillna(method='ffill').fillna(0)

results = Algo.run(data)
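For readers trying to reproduce the rebalance schedule, the date selection done by endpoints() above can be sketched with the standard library alone. This is a simplified stand-in, not the zipline code: it uses plain weekdays instead of the real trading calendar (so holidays are ignored), and the name period_end_dates is hypothetical.

```python
from datetime import date, timedelta

def period_end_dates(dates, months=1):
    """Last date of each month in `dates`, then every `months`-th of those."""
    dates = sorted(dates)
    # A date is a month end if the next date falls in a different month.
    month_ends = [prev for prev, cur in zip(dates, dates[1:])
                  if (cur.year, cur.month) != (prev.year, prev.month)]
    month_ends.append(dates[-1])  # the final date always closes the last period
    return month_ends[::months]

# Weekdays from 1 Jan 2014 to 30 Apr 2014 as a stand-in for trading days.
start = date(2014, 1, 1)
all_days = [start + timedelta(days=i) for i in range(120)]
weekdays = [d for d in all_days if d.weekday() < 5]

monthly = period_end_dates(weekdays, months=1)
quarterly = period_end_dates(weekdays, months=3)
print(quarterly)
```

Because the step starts from the first month end in the range, a start date in January with a three-month period picks the ends of January, April, July and October rather than calendar quarter ends. When comparing two backtests, it is worth checking that both rebalance on the same dates.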

Grant Kiehne

Mar 17, 2014

Hi Dave,

It's maybe harebrained, but I gather from Anony Mole's example (https://www.quantopian.com/posts/trade-at-the-open-slippage-model) that one might be able to specify any price desired, including loading in prices using fetcher. If this works, you could load your Yahoo-derived prices to see if the online backtester yields the same result as the offline zipline.

Perhaps this would help you understand the difference you are seeing?
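The idea can be sketched in isolation as a fill step that prefers an externally supplied price over the bar's own price. This is a standalone illustration only: fill_order and its arguments are hypothetical, and it does not use or reproduce the Quantopian slippage API from the linked post.

```python
def fill_order(amount, bar_price, external_prices=None, date=None):
    """Return (fill_price, cost) for an order of `amount` shares.

    Uses the external price for `date` when one is available and
    falls back to the bar price otherwise.
    """
    price = bar_price
    if external_prices is not None and date in external_prices:
        price = external_prices[date]
    return price, amount * price

# Made-up prices: the backtester's bar price and a fetched external close.
fetched_close = {"2014-03-17": 50.25}

default_fill = fill_order(10, 50.5)
external_fill = fill_order(10, 50.5, fetched_close, "2014-03-17")
print(default_fill, external_fill)
```

Falling back to the bar price keeps the backtest running on days where the external source has no data, but such days would then be filled at a different price than the offline run used.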

Grant

Dave Gilbert

Mar 18, 2014

Thanks for the tip, Grant. Works like a charm!

Grant Kiehne

Mar 18, 2014

Cool! Glad it worked. --Grant

John Fawcett

Mar 18, 2014

Wow, that is a very impressive hack!

Nicholas Goodrich

Mar 18, 2014

Clever way of dealing with the issue, though it seems a "buy/sell anything" feature would just involve a non-default constructor for a sid that allows importing all the required data for said "security". Perhaps the issue is that the portfolio acts as a real portfolio and buys faux shares of the security, but that could perhaps be dealt with by adding a sort of "dummy portfolio". I am not sure whether commodity trading algorithms can be implemented on Quantopian, but this could be a way to do that, at least for backtesting purposes, along with other types of "security" such as bitcoin or any altcoin.
