Normalizing positive and negative values separately

Pipeline values are often both positive and negative. When fed to the optimizer, positive values become long positions and negative values become shorts.
When using TargetWeights instead of MaximizeAlpha, there are various ways to normalize them for a target leverage of 1.0.
I decided to toss this one out there since I've been using it quite a bit.

import pandas as pd

def before_trading_start(context, data):
    context.out = pipeline_output('pipe')
    context.alpha = norm(context, context.out.alpha)

def norm(c, d):    # d is a pandas Series; normalize its positive and negative values separately
    do_demean = 1                           # center values around 0 (only applied when all are one-sided)
    preserve_zero_values = 1                # change to 0 if incoming zero weights should simply be dropped
    trim_pos_neg_to_same_number_each = 1    # same number of stocks for positive & negative

    if not len(d): return d   # In case it is empty.
    d = d[ d == d ]           # Ensure no NaNs (NaN != NaN).

    if do_demean:   # If all values are positive or all negative, shift so there are both.
        if d.min() >= 0 or d.max() <= 0:
            d -= d.mean()

    zeros = None
    if preserve_zero_values:
        zeros = d[ d == 0 ]

    pos = d[ d > 0 ]
    neg = d[ d < 0 ]

    if trim_pos_neg_to_same_number_each:
        num  = min(len(pos), len(neg))
        pos  = pos.sort_values(ascending=False).head(num)   # keep the largest positives
        neg  = neg.sort_values(ascending=False).tail(num)   # keep the most negative

    pos /=   pos.sum()            # positives now sum to +1
    neg  = -(neg / neg.sum())     # negatives now sum to -1
    ret  = pd.concat([pos, neg])  # Series.append was removed in pandas 2.0

    if preserve_zero_values and zeros is not None:
        ret = pd.concat([ret, zeros])

    return ret
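
As a quick illustration of the core step, here is a minimal standalone sketch that runs outside the platform with plain pandas. The helper name normalize_sides and the tickers A to E are made up for the example; it leaves out the demean, zero-preserving and trimming options.

```python
import pandas as pd

def normalize_sides(values):
    """Illustrative helper (not part of the post's code): scale each side separately."""
    values = values.dropna()
    pos = values[values > 0]
    neg = values[values < 0]
    pos = pos / pos.sum()        # positives now total +1
    neg = -(neg / neg.sum())     # negatives now total -1
    return pd.concat([pos, neg])

# Made-up alpha values, including one NaN that gets dropped.
alpha = pd.Series({'A': 3.0, 'B': 1.0, 'C': -2.0, 'D': -6.0, 'E': float('nan')})
weights = normalize_sides(alpha)
print(weights)
```

Each side keeps its internal proportions: A is three times B among the longs, and D is three times C among the shorts.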


3 responses

Hey Blue, I was scrolling through the forum, came across this post, and wanted to get your thoughts. Lately I've been experimenting with some ML and noticed that the outputs are either 0 or 1. Typically, normalizing your pipeline output means separating pos (long) from neg (short). But in a case where my inputs to the optimizer are either 0 or 1, is normalizing not needed? Or is the optimizer smart enough to go long all the 1s and short all the 0s?

That's a case I had not thought of. They definitely do need to be normalized though; otherwise each 1 would be telling Opt to try to order the full portfolio value for that stock, although other constraints can change that. The 0s would be telling Opt to try to close those stocks.

You can use the debugger to see what norm() is doing along the way. Set a breakpoint on its first line, if not len(d) ..., by clicking on the line number in the margin, then run it. It will take longer to start, and when it breaks in (stops on that line), type in the console area that appears, for example (each followed by [Enter]):

len(d)  
d  

The latter will display a preview of the data, but know that if it is very large the debugger will crash. If that's the case, you could temporarily trim back the number of stocks.

Use an arrow button to step forward and when you have passed the line neg = neg.sort_values(ascending=False).tail(num), then type pos and neg.

The line d -= d.mean() would have been shifting your values to around +/- .5 (exactly so when there are as many 1s as 0s). I made some changes for more flexibility.
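
To make the shift concrete, here is a small sketch of what subtracting the mean does to 0/1 values. The tickers and labels are made up; the point is that the shifted values depend on the mix of 1s and 0s.

```python
import pandas as pd

# Made-up 0/1 model outputs: one balanced set, one with more 1s than 0s.
balanced = pd.Series({'A': 1.0, 'B': 1.0, 'C': 0.0, 'D': 0.0})
skewed = pd.Series({'A': 1.0, 'B': 1.0, 'C': 1.0, 'D': 0.0})

balanced_shifted = balanced - balanced.mean()   # mean is 0.5
skewed_shifted = skewed - skewed.mean()         # mean is 0.75

print(balanced_shifted)   # 1s become +0.5, 0s become -0.5
print(skewed_shifted)     # 1s become +0.25, the single 0 becomes -0.75
```

Either way the 1s end up positive and the 0s negative, so the later split into pos and neg still sorts them onto the long and short sides.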

Hope you'll let me know how things go. I'm not sure what your 0 values are intended to do, whether to close positions or maybe as a signal saying you want to short them, and I've been thinking that might be the case as I wrote this. Negative values passed to optimize become shorted stocks. You should wind up with a lot of little values from this, both positive and negative, that all total up to zero: the positive values by themselves total 1 and the negative values total -1.
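
Those totals can be checked directly. The sketch below uses made-up weights of the shape norm() is described as producing and computes net and gross exposure. The final halving step is an assumption about intent: the post does not say whether a target leverage of 1.0 means per side or gross, and scaling by 0.5 is just one option if gross exposure of 1.0 is wanted.

```python
import pandas as pd

# Made-up weights: positives total +1, negatives total -1.
weights = pd.Series({'A': 0.75, 'B': 0.25, 'C': -0.25, 'D': -0.75})

long_total = weights[weights > 0].sum()    # sum of the long side
short_total = weights[weights < 0].sum()   # sum of the short side
net = weights.sum()                        # long + short
gross = weights.abs().sum()                # long + |short|

# Assumption: if gross exposure of 1.0 is the goal, halve every weight.
half = weights * 0.5
half_gross = half.abs().sum()

print(long_total, short_total, net, gross, half_gross)
```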


Thanks Blue! I appreciate your suggestions and I definitely will keep you posted.

"I'm not sure what your 0 values are intended to do, whether close or maybe it is a signal saying you want to short them, and I've been thinking that might be the case as I wrote this."

Yes, you are correct in that in this instance 0 would be a signal to short vs 1 signaling buy.

I'll try to look for a simple example of what I remember seeing in the research env. Thinking back, I may be confusing a custom factor's output of 0 or 1 with a pipeline output fed into the optimizer... I'll do some digging.
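
For the convention described above (0 means short, 1 means buy), one possible sketch is to map the labels to -1/+1 explicitly and then normalize each side. The tickers and labels are made up, and this is an alternative to relying on the demean step, not something taken from the thread's code.

```python
import pandas as pd

# Made-up model output: 1 = buy, 0 = short.
labels = pd.Series({'A': 1, 'B': 0, 'C': 1, 'D': 1, 'E': 0})

signal = labels * 2 - 1          # 1 -> +1, 0 -> -1
longs = signal[signal > 0]
shorts = signal[signal < 0]

# Equal weight within each side: longs total +1, shorts total -1.
weights = pd.concat([longs / longs.sum(), -(shorts / shorts.sum())])
print(weights)
```

With three longs and two shorts, each long gets 1/3 and each short gets -1/2. Because all values on a side are tied, trimming both sides to the same count (as norm() does when that flag is on) would pick among them by sort order only.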