How to get hold of open/close prices from load_from_yahoo?

Hi,

I am using zipline on my own PC. In Quantopian I see data[SYMBOL].close and data[SYMBOL].open. In zipline I can see that the cached CSV files from Yahoo include the open and close values, but for some reason only the SYMBOL and Adj. Close price seem to come out. How is that possible?

I have the feeling that it has to do with the Dictionary and this line: df = pd.DataFrame({key: d[close_key] for key, d in data.iteritems()})

My Python is not good enough to understand what is happening. I suspect that only the index (SYMBOL) and one price are moved to the df, and that the other columns of the Yahoo data are left behind.

I tried to use pandas.io.data to load the Yahoo data as well, but it complains about datetime differences :-(.

Any help appreciated.

J.

The function in question is:

def load_from_yahoo(indexes=None,
                    stocks=None,
                    start=None,
                    end=None,
                    adjusted=True):
    """
    Loads price data from Yahoo into a dataframe for each of the indicated
    securities. By default, 'price' is taken from Yahoo's 'Adjusted Close',
    which removes the impact of splits and dividends. If the argument
    'adjusted' is False, then the non-adjusted 'close' field is used instead.

    :param indexes: Financial indexes to load.
    :type indexes: dict
    :param stocks: Stock closing prices to load.
    :type stocks: list
    :param start: Retrieve prices from start date on.
    :type start: datetime
    :param end: Retrieve prices until end date.
    :type end: datetime
    :param adjusted: Adjust the price for splits and dividends.
    :type adjusted: bool

    """
    data = _load_raw_yahoo_data(indexes, stocks, start, end)
    if adjusted:
        close_key = 'Adj Close'
    else:
        close_key = 'Close'
    df = pd.DataFrame({key: d[close_key] for key, d in data.iteritems()})
    df.index = df.index.tz_localize(pytz.utc)
    return df
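
To illustrate why only one price per symbol survives, here is a minimal sketch of what the dictionary comprehension in that function does. It uses small synthetic frames instead of real Yahoo data, and the names `raw` and `prices` are made up for this example. It also uses `.items()` where the zipline source uses Python 2's `.iteritems()`.

```python
import pandas as pd

# Synthetic stand-in for the raw Yahoo data (values are made up):
# one frame per symbol, each with several price columns.
dates = pd.date_range("2000-01-03", periods=3, freq="D")
raw = {
    "SPY": pd.DataFrame(
        {"Open": [148.0, 143.5, 139.9],
         "Close": [145.4, 139.8, 140.0],
         "Adj Close": [100.1, 96.2, 96.4]},
        index=dates,
    ),
    "AAPL": pd.DataFrame(
        {"Open": [3.7, 3.9, 3.7],
         "Close": [4.0, 3.7, 3.7],
         "Adj Close": [3.4, 3.1, 3.2]},
        index=dates,
    ),
}

close_key = "Adj Close"

# For each symbol, pick ONE column (a Series) out of its frame.
# The resulting DataFrame has one column per symbol, so Open, Close,
# etc. are never copied across.
prices = pd.DataFrame(
    {symbol: frame[close_key] for symbol, frame in raw.items()}
)
prices.index = prices.index.tz_localize("UTC")

print(prices)
```

The printed frame has the symbols as column names and the adjusted close as the only value in each cell, which matches what the question describes.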

Sample code:

The code below returns 6 columns, including high, low, etc., whereas zipline only returns Adj. Close.

from datetime import datetime

import pandas as pd
import pytz

SYMBOL = 'SPY'

start = datetime(2000, 1, 1, 0, 0, 0, 0, pytz.utc)  
end =  datetime(2001, 1, 2, 0, 0, 0, 0, pytz.utc)  
date_range = daterange(start, end)

data = pd.io.data.get_data_yahoo(SYMBOL,start=start, end=end, adjust_price=True)  
data.to_csv('c:\\data\\test.csv', encoding='utf-8')  
data = pd.DataFrame.from_csv('c:\\data\\test.csv', parse_dates=True, encoding='utf-8')  
data.index = data.index.tz_localize(pytz.utc)

print data.head()  
print data[['Open', 'Close']].head()

#close_key = 'Close'  
#df = pd.DataFrame({key: d[close_key] for key, d in data.iteritems()})

#data = load_from_yahoo(stocks={SYMBOL}, indexes={}, start=start, end=end, adjusted=True)  
# ^-- the returned frame has no Close or Open columns
1 response

The DataFrame returned by load_from_yahoo is a 2D table with dates as rows and one column per security, so it can hold only one value per cell. That value is 'Adj Close' (or 'Close' when adjusted=False), which lets it build a time series covering several securities. To get the other fields for each day, you will have to load the data yourself and integrate it into your backtests. One option is to build a separate time series for each security, with open, high, low, etc. as columns.
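
As a rough sketch of that last option, the per-security frames can be kept as they are, or joined side by side with `pd.concat` so that the columns form a (symbol, field) hierarchy. The data below is synthetic and the names `raw`, `combined`, `spy_open_close` and `all_opens` are illustrative only. This is not part of zipline's API, and wiring the result into a backtest is left out.

```python
import pandas as pd

# Synthetic per-security frames (values are made up) with several fields each.
dates = pd.date_range("2000-01-03", periods=3, freq="D", tz="UTC")
raw = {
    "SPY": pd.DataFrame(
        {"Open": [148.0, 143.5, 139.9],
         "High": [148.3, 144.1, 141.5],
         "Low": [143.9, 139.6, 137.3],
         "Close": [145.4, 139.8, 140.0]},
        index=dates,
    ),
    "AAPL": pd.DataFrame(
        {"Open": [3.7, 3.9, 3.7],
         "High": [4.0, 3.9, 3.9],
         "Low": [3.6, 3.6, 3.7],
         "Close": [4.0, 3.7, 3.7]},
        index=dates,
    ),
}

# Passing a dict to pd.concat with axis=1 uses the keys as the outer
# column level, giving columns like ('SPY', 'Open'), ('SPY', 'High'), ...
combined = pd.concat(raw, axis=1)

# All fields for one security:
spy_open_close = combined["SPY"][["Open", "Close"]]

# One field across all securities (same shape as load_from_yahoo's result):
all_opens = combined.xs("Open", axis=1, level=1)

print(spy_open_close)
print(all_opens)
```

Selecting by symbol gives back a per-security table with every field, while `xs` on the inner level recovers the one-column-per-symbol layout for any single field.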