How to use code like fetcher offline?

Hi,

I would like to do some offline analytics and plotting on my own data, and to debug why, for instance, talib MACD does not work on Quantopian when it does work offline.

I want to reuse most of my code so I can easily copy-paste between zipline and the Quantopian platform. Since the latest zipline does not include some features that are available on Quantopian, I am trying to build them myself. However, I am not a Python expert.

How can I wrap the fetcher around pandas' read_csv? For instance, my dataframe has no 'sid' column, so rename_col(data) cannot do df = df[['sid', 'price']]. I am able to rename 'open' to 'price', but how do I add the sid?

Any help appreciated or references where to look.

import datetime  
from pandas import read_csv  
# Dates in the file are formatted as month-day-year  
parse = lambda x: datetime.datetime.strptime(x, '%m-%d-%Y')  
page = 'C:\\data\\FDAX-DAILY-19971119-20140301.txt'  
data = read_csv(page, delim_whitespace=True, names=['Date','Open','High','Low', 'Close', 'Volume','OI'], index_col=0, parse_dates=[0], date_parser=parse)  
data = rename_col(data)  # user-defined helper (not shown)

7 responses

Quant Trader

Hi,
What I mean is: how do I get the 'sid' into the following result so that it is compatible with Quantopian?
J.


DatetimeIndex: 4114 entries, 1997-09-11 00:00:00 to 2014-01-03 00:00:00  
Data columns (total 6 columns):  
price     4114  non-null values  
high      4114  non-null values  
low       4114  non-null values  
close     4114  non-null values  
volume    4114  non-null values  
oi        4114  non-null values  
dtypes: float64(4), int64(2)
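
One possible way to get a 'sid' column offline is to add it as a constant column right after loading the file. This is only a minimal sketch, assuming a single instrument and a whitespace-delimited file with the seven columns named in the original post. The function name `load_local_csv`, the 'FDAX' label and the two sample rows are made up for illustration, and this is not how Quantopian's fetcher is implemented.

```python
import io

import pandas as pd


def load_local_csv(path_or_buffer, sid):
    """Load a whitespace-delimited daily file and tag every row with `sid`.

    Hypothetical helper: the column names follow the original post,
    everything else is an assumption.
    """
    df = pd.read_csv(
        path_or_buffer,
        sep=r"\s+",
        names=["Date", "Open", "High", "Low", "Close", "Volume", "OI"],
    )
    # Parse the month-day-year dates and use them as the index.
    df["Date"] = pd.to_datetime(df["Date"], format="%m-%d-%Y")
    df = df.set_index("Date")
    # Lower-case the column names and expose 'open' as 'price'.
    df.columns = [c.lower() for c in df.columns]
    df = df.rename(columns={"open": "price"})
    # A scalar assignment broadcasts to every row.
    df["sid"] = sid
    return df


# Made-up sample rows standing in for the real file.
sample = io.StringIO(
    "09-11-1997 4000.0 4010.0 3990.0 4005.0 1000 500\n"
    "09-12-1997 4005.0 4020.0 4000.0 4015.0 1200 520\n"
)
frame = load_local_csv(sample, sid="FDAX")
subset = frame[["sid", "price"]]  # the selection that failed before
```

With the column in place, the df[['sid', 'price']] selection from the question no longer raises a KeyError. Whether a string label or an integer is the right stand-in for a Quantopian sid depends on the rest of your code.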

Thomas Wiecki

Hi,

Are you using zipline but want to use data from a csv file? Zipline questions should be directed to https://groups.google.com/forum/#!forum/zipline

Thomas

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Quant Trader

Hi Thomas,
I believe it is more an interface/ porting to and from question.... and/or 'not knowing how to use Python' question. :-).
J.

Thomas Wiecki

Hi Quant Trader,
We actually thought of porting fetcher to zipline; it shouldn't be that difficult. Allowing a local file to be passed in instead of a URL is a great idea, actually. Unfortunately, I think that writing code to make the fetcher data available as it is on Quantopian requires some knowledge of zipline internals. But if you want to give it a go, we can certainly provide some help! Otherwise you could just wait until we make fetcher available; unfortunately, I can't promise when that will be...
Thomas

Blue Seahawk

QT> [...] debugging why for instance talib MACD is not working in Quantopian [...]

Yep, appreciate it if a dev at Quantopian could take 4 minutes to check whether that's a quick fix.
Or maybe someone in the community can find a way for MACD from 'talib' (not 'ta') to return something besides 'nan'.

Scott Sanderson

Hi Gary,

As you probably already know, MACD involves computing three moving averages of varying lengths, in your case of lengths 12, 26 and 9. In general, if you ask a TALib function for a moving average of length N, it will give you back NaN for the first N-1 values of the passed array, because there isn't enough data to compute the average over the whole window length. Thus, for example, doing

talib.EMA(np.arange(5.0, dtype=np.float64), 3)

yields
array([ nan, nan, 1., 2., 3.])
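
To see where those leading NaNs come from, here is a rough pure-NumPy sketch of an EMA with a warm-up period. It assumes the first defined value is seeded with a simple average of the first N inputs, which reproduces the output above; the function name `ema_with_warmup` is made up, and this is not TALib's actual code.

```python
import numpy as np


def ema_with_warmup(values, period):
    """EMA that returns NaN until `period` values have been seen.

    Assumption: the first defined value is the simple mean of the
    first `period` inputs.
    """
    values = np.asarray(values, dtype=np.float64)
    out = np.full(values.shape, np.nan)
    if len(values) < period:
        return out  # not enough data for even one average
    k = 2.0 / (period + 1)  # standard EMA smoothing factor
    out[period - 1] = values[:period].mean()
    for i in range(period, len(values)):
        out[i] = out[i - 1] + k * (values[i] - out[i - 1])
    return out


result = ema_with_warmup(np.arange(5.0), 3)
# result has N-1 = 2 leading NaNs, followed by 1., 2., 3.
```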

In the case of MACD, TALib computes two exponential moving averages (EMAs), in your case of period-lengths 12 and 26, resulting in 11 and 25 NaN values being generated for those averages.
After computing those values, it then computes a 9-day EMA of the difference between the first two computed arrays. Since NaN + anything = NaN, this results in TALib computing a 9-period moving average on an array that contains 25 NaNs. Clearly, the first 25 of these averages will also be NaNs. Less obviously, the next 8 results will also be NaNs, since TALib evaluates any average containing a NaN to NaN. Thus, for a call to
talib.MACD(array, 12, 26, 9) the first 33 values will be NaN. You can check this by trying:
talib.MACD(np.arange(34) * 1.0, 12, 26, 9) which yields

(array([ nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  
         7.]),  
 array([ nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  
         7.]),  
 array([ nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  
         0.]))

Surprising (to me at least) is the fact that the first array (the MACD line, which depends only on the 12 and 26 day EMAs and could in principle be defined from index 25 onward) has the same NaN behavior as the signal line and the histogram. Looking through the TALib C source, however, it turns out that their MACD implementation computes a single lookback window to use for all three outputs, which results in more NaNs than expected for the MACD line.
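
The single-lookback behavior described above can be emulated in plain NumPy to count how much history is needed. This is a simplified sketch, not TALib's implementation: the helper names are made up, the EMAs are assumed to be seeded with a simple average, and the values may differ from TALib's on real price data even though the NaN pattern is the same.

```python
import numpy as np


def sma_seeded_ema(values, period):
    """EMA with `period - 1` leading NaNs (assumed SMA seed)."""
    values = np.asarray(values, dtype=np.float64)
    out = np.full(values.shape, np.nan)
    if len(values) < period:
        return out
    k = 2.0 / (period + 1)
    out[period - 1] = values[:period].mean()
    for i in range(period, len(values)):
        out[i] = out[i - 1] + k * (values[i] - out[i - 1])
    return out


def macd_with_single_lookback(values, fast=12, slow=26, signal=9):
    """Return (macd, signal, hist, lookback), assuming fast < slow.

    All three outputs share one lookback of (slow - 1) + (signal - 1).
    """
    values = np.asarray(values, dtype=np.float64)
    lookback = (slow - 1) + (signal - 1)
    macd_line = sma_seeded_ema(values, fast) - sma_seeded_ema(values, slow)
    # The signal line is an EMA over the defined part of the MACD line.
    signal_line = np.full(values.shape, np.nan)
    signal_line[slow - 1:] = sma_seeded_ema(macd_line[slow - 1:], signal)
    # Blank the MACD line until the signal line is defined as well.
    macd_line[:lookback] = np.nan
    return macd_line, signal_line, macd_line - signal_line, lookback


macd_line, signal_line, hist, lookback = macd_with_single_lookback(
    np.arange(34) * 1.0
)
# lookback is 33, so only the last of the 34 values is defined.
```

In practice this means a price history needs at least slow + signal - 1 values (34 for the 12, 26, 9 settings) before the function returns anything other than NaN.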

Blue Seahawk

[Edit] @ Scott Sanderson
OK, so the context.prices list needed to be a lot longer than 27 for talib.MACD()
(and then the talib.MACD() values appear to match TradingView, for example, more closely than the ta.MACD() values do)
