Stuck on first algorithm

I'm very new to Quantopian and I'm working through the tutorials, reference guides, and lectures, but I feel like I'm not quite understanding the pipeline. I want to create a ratio between the average volume over the last 10 days for each asset in the QTradableStocksUS universe and the same average for SPY. I then want to create a score based on the difference between that ratio and the same ratio computed from yesterday's volume only.

(last_10_days_spy_sma/last_10_days_each_sma) - (yesterday_spy_sma/yesterday_each_sma)

Most of my struggles come from wanting to look at volume rather than AverageDollarVolume. When I work with equity objects in the pipeline instead of Factors, I run into errors about Equity objects not being iterable and about nested windowed expressions.

I guess I would like to look at this problem from a clean slate and make sure I understand the root of my problems.

How should I go about creating a ratio of volumes between SPY and all assets in a pipeline that can be compared over 2 different window lengths?

Thank you

7 responses

So after some deliberation, it probably just makes more sense to use AverageDollarVolume since it would be invariant across split adjustments and I don't believe volume would be.

Pipeline can at first be a bit much to wrap one's head around. The first thing to understand is that you are defining the columns of a dataframe. The columns you define are ostensibly the data your algo requires for its decision making. It's no more than getting all the data in one place. Now, by using filters, one can, and often does, implement some logic within the pipeline definition. This is ok, but first and foremost get the data. There is a forum post here which may provide some better background on pipeline.

So, data. The first question then is what data you need. Two pieces of data have been alluded to: volume and dollar-volume. No problem, the pipeline definitions for both are very similar. Oh, almost forgot to mention. It is VERY strongly recommended to create pipeline definitions in the notebook environment. It's faster, and moreover the output can be easily visualized. Pipeline definitions, along with custom factors, can be copied from a notebook into the IDE and work the same.

The 10-day SMA and latest values for volume are pretty easy:

# The datasets, the built-in factor, and the geo domain we want to use
from quantopian.pipeline.data import EquityPricing
from quantopian.pipeline.domain import US_EQUITIES
from quantopian.pipeline.factors import SimpleMovingAverage

# Use the SimpleMovingAverage factor to calculate the 10 day SMA. It will calculate for all securities  
last_10_days_each_sma = SimpleMovingAverage(inputs=[EquityPricing.volume], window_length=10)

# Use the latest method to just get the latest value  
yesterday_each_sma = EquityPricing.volume.latest
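
For anyone who wants to see what these two values amount to outside of Quantopian, here is a minimal plain-Python sketch for a single security. The volume numbers and the helper names `simple_moving_average` and `latest` are made up for illustration; this is not the pipeline API, just the arithmetic behind it.

```python
# Hypothetical daily share volumes for one security, oldest first.
volumes = [100, 120, 90, 110, 130, 95, 105, 115, 125, 110, 140]


def simple_moving_average(values, window_length):
    """Mean of the most recent `window_length` values."""
    window = values[-window_length:]
    return sum(window) / len(window)


def latest(values):
    """Most recent value in the series."""
    return values[-1]


sma_10 = simple_moving_average(volumes, 10)  # 114.0 (the oldest day is dropped)
last = latest(volumes)                       # 140
print(sma_10, last)
```

In the pipeline the same calculation is simply done for every security at once, which is why each factor ends up with one value per security.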

The one confounding detail is that we now want these values just for SPY. Typically, the output of a factor provides a value for every security; it's kind of like a series. When one performs math on factors, the values are matched up security by security. To pass the value of SPY to all securities, one must first 'slice' the factor to obtain just the SPY value, and then 'broadcast' that value to all securities. There's more detail in the docs here. The code uses bracket notation like this.

last_10_days_spy_sma = last_10_days_each_sma[symbols('SPY')]  
yesterday_spy_sma = yesterday_each_sma[symbols('SPY')]

Now that we have our basic factors, we can create some calculated factors based on these.

# We can now find the ratio. For technical reasons the first operand must be a factor and not a slice  
ratio_10_days = last_10_days_each_sma / last_10_days_spy_sma  
ratio_1_days = yesterday_each_sma / yesterday_spy_sma

Generally, slices and full factors can be used together (factors internally broadcast the slice data to all assets). However, slices are sort of second-class citizens. In this case, a factor knows what to do with a slice, but a slice doesn't know what a factor is. The order is important. Put the factor first.
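
The slice-then-broadcast idea can be sketched in plain Python, assuming each factor is just a mapping from ticker to value. The tickers, the numbers, and the helper `ratio_to_benchmark` are hypothetical, and the ratio is oriented as each security divided by SPY, following the snippet above rather than the formula in the original question.

```python
# Hypothetical factor outputs: one value per security (made-up numbers).
sma_10_volume = {"SPY": 80.0, "AAPL": 40.0, "XYZ": 8.0}
latest_volume = {"SPY": 100.0, "AAPL": 25.0, "XYZ": 20.0}


def ratio_to_benchmark(factor_values, benchmark="SPY"):
    """Divide every security's value by the benchmark's value."""
    # The 'slice': pick out the single benchmark number.
    benchmark_value = factor_values[benchmark]
    # The 'broadcast': reuse that one number for every security.
    return {asset: value / benchmark_value
            for asset, value in factor_values.items()}


ratio_10_days = ratio_to_benchmark(sma_10_volume)
ratio_1_day = ratio_to_benchmark(latest_volume)

# Score: difference between the 10-day ratio and yesterday's ratio.
score = {asset: ratio_10_days[asset] - ratio_1_day[asset]
         for asset in ratio_10_days}
print(score)
```

SPY always scores zero here because its ratio to itself is 1 in both windows; every other security's score shows whether its relative volume rose or fell versus the 10-day average.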

A pipeline can be set up using the dollar-volume in much the same way. Check out the attached notebook.

My personal opinion is that dollar-volume is the way to go. Investors invest money, not shares. By the way, both dollar-volume and volume are adjusted for splits, so no need to worry there. Actually, my advice would be to not use SPY volume but rather the total dollar-volume of all shares in your universe. If that's something you want to pursue, it will require a small custom factor.
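
As a rough illustration of that last suggestion, here is the arithmetic such a custom factor would need to do, written in plain Python rather than as an actual CustomFactor. The tickers, prices, and volumes are invented, and a single day is used for brevity; a real factor would presumably average over a window first.

```python
# Hypothetical closing prices and share volumes for a tiny universe.
close = {"AAA": 10.0, "BBB": 50.0, "CCC": 20.0}
volume = {"AAA": 1000, "BBB": 100, "CCC": 250}

# Dollar-volume per security: price times shares traded.
dollar_volume = {asset: close[asset] * volume[asset] for asset in close}

# The benchmark is now the whole universe instead of SPY.
total_dollar_volume = sum(dollar_volume.values())

# Each security's share of the universe's total dollar-volume.
share_of_total = {asset: value / total_dollar_volume
                  for asset, value in dollar_volume.items()}
print(share_of_total)
```

Note that BBB trades far fewer shares than CCC but moves the same amount of money, which is the argument for dollar-volume over raw volume.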

Good luck.

Here's the notebook.

Dan,

Thank you so much for the detailed response. You have made this much easier to understand. I believe I made a big mistake in creating a custom factor as I was under the assumption that I had to create a custom factor because the volume factor did not exist. When I tried to slice the datapoints for SPY inside of the custom factor, I ran into issues that clearly could have been avoided.

I'm not one to ask for help without losing some hair and sleep over the issue so I am incredibly grateful. Thank you.

I do have one small question about your notebook if you would be so kind. I noticed you mask the SimpleMovingAverage factor, but don't mask the Latest factor (maybe you can't), and then you mask the pipeline.

Are these masks on the pipeline and SimpleMovingAverage redundant? Maybe it's just more efficient?

Thanks

Typically a mask is used for methods like zscore or rank. That limits the universe that the method is looking at. Consider rank. A stock may be ranked 500 within the QTradableStocksUS universe but 1 in the Q500US. Use a mask to define the 'ranking universe'.

However, the mask parameter is generally used to speed up calculations. Only stocks that pass the mask filter are passed to the factor. Speed is generally not an issue with pipelines, but if one has a custom factor with an involved compute function, it may be desirable to use a mask.
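
The 'ranking universe' point can be seen in a small plain-Python sketch. The helper `rank_descending` and the scores are hypothetical and only mimic the idea of a mask; here rank 1 is assigned to the highest value, which is a choice made for this example and not a statement about the pipeline's default ordering.

```python
# Hypothetical factor values for four securities.
scores = {"A": 5.0, "B": 9.0, "C": 7.0, "D": 1.0}


def rank_descending(values, mask=None):
    """Rank securities so that 1 is the highest value.

    If `mask` is given, only securities in it are ranked at all.
    """
    if mask is None:
        universe = values
    else:
        universe = {asset: value for asset, value in values.items()
                    if asset in mask}
    ordered = sorted(universe, key=universe.get, reverse=True)
    return {asset: position for position, asset in enumerate(ordered, start=1)}


full_rank = rank_descending(scores)                     # A is ranked 3 of 4
masked_rank = rank_descending(scores, mask={"A", "D"})  # A is ranked 1 of 2
print(full_rank["A"], masked_rank["A"])
```

The same security gets a different rank depending on which universe it is compared against, and the masked call also does less work because it only looks at two securities.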

I just used a mask out of habit. It's not really needed in this case (though maybe marginally faster). You are correct that the latest method doesn't take any parameters, so one cannot use a mask. The filter which is most important is the screen applied to the final pipeline. That definitely limits which securities are passed to the algo.

Hope that makes sense.

Since my last post, I've been trying to migrate this notebook into the IDE and make it work. It has been a rough go. I finally found my issue, though I'm not quite sure of the "why" yet.

While I was aware of the "symbols" function issue between IDE and research, it was an awfully sneaky/devilish line that was causing me a lot of pain.

When I switched the code over to using symbols('Ticker') from symbols(['Ticker']), the slices no longer worked because of this error:
"TypeError: Term.__getitem__() expected a value of type Asset for argument 'key', but got list instead."

Since "symbols" returns a list while "symbol" returns a security object, I switched all of the instances of "symbols" over to using "symbol" instead. This included the spy filter being created at the beginning of the function. This was the next error I received.

"TypeError: 'zipline.assets._assets.Equity' object is not iterable There was a runtime error on line 35."

This took a very long time to dig through because the error was raised when the algo was attaching the pipeline. It turns out that the spy filter at the top, which uses StaticAssets, needs an iterable, so it has to be built with the "symbols" function and not the "symbol" function, which I had just switched it to. Doh.
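
For anyone hitting the same pair of errors, here is a small plain-Python sketch of the single-object versus list mix-up. Everything in it is a stand-in written for this example: the `Equity` class and the functions `symbol`, `symbols`, `static_assets`, and `slice_factor` only imitate the behaviour described above (one returns a single security, the other a list) and are not the Quantopian implementations.

```python
class Equity:
    """Hypothetical stand-in for a security object."""

    def __init__(self, ticker):
        self.ticker = ticker


def symbol(ticker):
    """Stand-in: returns one security object."""
    return Equity(ticker)


def symbols(*tickers):
    """Stand-in: returns a list of securities, even for a single ticker."""
    return [Equity(ticker) for ticker in tickers]


def static_assets(assets):
    """Stand-in for a filter that needs an iterable of securities."""
    return {asset.ticker for asset in assets}


def slice_factor(factor_values, key):
    """Stand-in for factor[key]: needs exactly one security, not a list."""
    if not isinstance(key, Equity):
        raise TypeError("expected a single security, got %s" % type(key).__name__)
    return factor_values[key.ticker]


volume = {"SPY": 100.0, "AAPL": 25.0}

# Correct pairing: a list for the filter, a single object for the slice.
spy_filter = static_assets(symbols("SPY"))
spy_volume = slice_factor(volume, symbol("SPY"))

# Swapping them reproduces both kinds of failure.
errors = []
for bad_call in (lambda: static_assets(symbol("SPY")),
                 lambda: slice_factor(volume, symbols("SPY"))):
    try:
        bad_call()
    except TypeError as exc:
        errors.append(str(exc))
print(errors)
```

The rule of thumb that falls out of this: anything that filters on a set of securities wants the list-returning function, and anything that slices a factor wants the single-security one.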

Dan, thank you for your help again. I hope what I've learned here tonight/this morning can help some other fellow noobs like myself.