Notebook

This notebook shows three ways to plot MultiIndexed data, such as pipeline output for multiple assets.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import Fundamentals
from quantopian.pipeline.filters import StaticAssets
from quantopian.research import run_pipeline, symbols
In [2]:
assets_screen = StaticAssets(symbols(['T', 'VZ']))
fundamentals = {
    'net_income': Fundamentals.net_income_continuous_operations.latest,
    'operating_gains_losses': Fundamentals.operating_gains_losses.latest,
    'operating_income': Fundamentals.operating_income.latest,
    'operating_expenses': Fundamentals.operating_expense.latest,
}

my_pipeline = Pipeline(columns=fundamentals, screen=assets_screen)
# Run the pipeline in chunks of 90 days to limit memory use
data = run_pipeline(my_pipeline, '2002-01-01', '2018-06-27', chunksize=90)
In [3]:
data.head()
Out[3]:
net_income operating_expenses operating_gains_losses operating_income
2002-01-02 00:00:00+00:00 Equity(6653 [T]) 2.072000e+09 2.200000e+09 -219000000.0 2.822000e+09
Equity(21839 [VZ]) 1.883000e+09 3.402000e+09 -29000000.0 3.677000e+09
2002-01-03 00:00:00+00:00 Equity(6653 [T]) 2.072000e+09 2.200000e+09 -219000000.0 2.822000e+09
Equity(21839 [VZ]) 1.883000e+09 3.402000e+09 -29000000.0 3.677000e+09
2002-01-04 00:00:00+00:00 Equity(6653 [T]) 2.072000e+09 2.200000e+09 -219000000.0 2.822000e+09
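
The output above is a dataframe with a two-level index: date first, asset second. To experiment with that shape outside the research environment, a minimal sketch follows. It is only a stand-in: the strings 'T' and 'VZ' replace the Equity objects, the level names 'date' and 'asset' are assumptions added for readability, and the values are random numbers rather than real fundamentals.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for pipeline output: strings replace Equity objects,
# and the values are random numbers, not real fundamentals.
dates = pd.date_range("2002-01-02", periods=5, freq="B", tz="UTC")
assets = ["T", "VZ"]
index = pd.MultiIndex.from_product([dates, assets], names=["date", "asset"])

rng = np.random.RandomState(0)
toy = pd.DataFrame(
    rng.rand(len(index), 2) * 1e9,
    index=index,
    columns=["net_income", "operating_income"],
)

print(toy.index.nlevels)                        # number of index levels
print(toy.index.levels[0][:3])                  # first few unique dates
print(toy.index.get_level_values(1).unique())   # assets present in the index
```

Each date appears once per asset, which is why the slicing steps below have to select on the second index level.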

1. Easy-to-understand example

In [4]:
# Slice the dataframe on the second index level to get the rows for each stock
idx = pd.IndexSlice
T_data = data.loc[idx[:, symbols('T')], :]
VZ_data = data.loc[idx[:, symbols('VZ')], :]
In [5]:
T_data.head()
Out[5]:
net_income operating_expenses operating_gains_losses operating_income
2002-01-02 00:00:00+00:00 Equity(6653 [T]) 2.072000e+09 2.200000e+09 -219000000.0 2.822000e+09
2002-01-03 00:00:00+00:00 Equity(6653 [T]) 2.072000e+09 2.200000e+09 -219000000.0 2.822000e+09
2002-01-04 00:00:00+00:00 Equity(6653 [T]) 2.072000e+09 2.200000e+09 -219000000.0 2.822000e+09
2002-01-07 00:00:00+00:00 Equity(6653 [T]) 2.072000e+09 2.200000e+09 -219000000.0 2.822000e+09
2002-01-08 00:00:00+00:00 Equity(6653 [T]) 2.072000e+09 2.200000e+09 -219000000.0 2.822000e+09
In [6]:
VZ_data.head()
Out[6]:
net_income operating_expenses operating_gains_losses operating_income
2002-01-02 00:00:00+00:00 Equity(21839 [VZ]) 1.883000e+09 3.402000e+09 -29000000.0 3.677000e+09
2002-01-03 00:00:00+00:00 Equity(21839 [VZ]) 1.883000e+09 3.402000e+09 -29000000.0 3.677000e+09
2002-01-04 00:00:00+00:00 Equity(21839 [VZ]) 1.883000e+09 3.402000e+09 -29000000.0 3.677000e+09
2002-01-07 00:00:00+00:00 Equity(21839 [VZ]) 1.883000e+09 3.402000e+09 -29000000.0 3.677000e+09
2002-01-08 00:00:00+00:00 Equity(21839 [VZ]) 1.883000e+09 3.402000e+09 -29000000.0 3.677000e+09
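
The slices above keep both index levels, so the asset still appears in every row. As a hedged alternative, pandas' `xs` selects on one level and drops it, leaving a frame indexed by date only. The sketch below uses toy data with strings in place of Equity objects (an assumption for illustration).

```python
import numpy as np
import pandas as pd

# Toy frame: strings stand in for Equity objects (assumption for illustration).
dates = pd.date_range("2002-01-02", periods=4, freq="B", tz="UTC")
index = pd.MultiIndex.from_product([dates, ["T", "VZ"]])
toy = pd.DataFrame({"net_income": np.arange(8, dtype=float)}, index=index)

idx = pd.IndexSlice
kept = toy.loc[idx[:, "T"], :]    # IndexSlice: both index levels are kept
dropped = toy.xs("T", level=1)    # xs: the asset level is dropped

print(kept.index.nlevels)
print(dropped.index.nlevels)
```

With the asset level dropped, the frame's own index can serve directly as the x-coordinates of a plot.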
In [7]:
# Construct our plot
dates = data.index.levels[0] # x-coordinates
plt.plot(dates, T_data['net_income'], label='T net_income')
plt.plot(dates, T_data['operating_expenses'], label='T operating_expenses')
plt.plot(dates, T_data['operating_gains_losses'], label='T operating_gains_losses')
plt.plot(dates, T_data['operating_income'], label='T operating_income')
plt.plot(dates, VZ_data['net_income'], label='VZ net_income')
plt.plot(dates, VZ_data['operating_expenses'], label='VZ operating_expenses')
plt.plot(dates, VZ_data['operating_gains_losses'], label='VZ operating_gains_losses')
plt.plot(dates, VZ_data['operating_income'], label='VZ operating_income')

# Add labels
plt.legend()
plt.title('Historical T and VZ fundamentals (USD)')
plt.show()
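
The plot above takes its x-coordinates from `data.index.levels[0]`, which works when every asset has a row for every date. A minimal sketch of what happens otherwise, using toy data in which one VZ row is removed on purpose (the data and the missing day are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Toy frame: strings stand in for Equity objects (assumption for illustration).
dates = pd.date_range("2002-01-02", periods=3, freq="B")
index = pd.MultiIndex.from_product([dates, ["T", "VZ"]])
toy = pd.DataFrame({"net_income": np.arange(6, dtype=float)}, index=index)

# Remove one VZ row to imitate an asset with a missing day.
level_dates = toy.index.get_level_values(0)
level_assets = toy.index.get_level_values(1)
toy = toy[~((level_dates == dates[1]) & (level_assets == "VZ"))]

vz = toy.xs("VZ", level=1)
all_dates = toy.index.levels[0]   # every date in the frame
vz_dates = vz.index               # only the dates VZ actually has

print(len(all_dates), len(vz), len(vz_dates))
```

Here `all_dates` is longer than the VZ series, so `plt.plot(all_dates, vz['net_income'])` would fail with a shape mismatch, while plotting against `vz_dates` would not.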

2. Similar idea, cleaner code

In [8]:
# This constructs the same plot, but using cleaner code
columns = list(data.columns)
dates = data.index.levels[0]

plt.clf() # clears previous plot

# Iterate through columns and plot each one
for column in columns:
    plt.plot(dates, T_data[column], label='T '+str(column))
    plt.plot(dates, VZ_data[column], label='VZ '+str(column))
    
plt.legend()
plt.title('Historical T and VZ fundamentals (USD)')
plt.show()
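
A further variant of the same idea, offered as a sketch rather than as part of the original notebook: `unstack` moves the asset level into the columns, so one call plots every (field, asset) pair. The toy data, the timezone-naive dates, and the `Agg` backend (chosen so the sketch runs without a display) are all assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Toy frame: strings stand in for Equity objects (assumption for illustration).
dates = pd.date_range("2002-01-02", periods=4, freq="B")
index = pd.MultiIndex.from_product([dates, ["T", "VZ"]])
toy = pd.DataFrame(
    {"net_income": np.arange(8, dtype=float),
     "operating_income": np.arange(8, dtype=float) * 2},
    index=index,
)

# Move the asset level into the columns: one column per (field, asset) pair.
wide = toy.unstack(level=1)
# Flatten the column labels to match the 'T net_income' style used above.
wide.columns = ["{} {}".format(asset, field) for field, asset in wide.columns]

fig, ax = plt.subplots()
wide.plot(ax=ax, title="Toy fundamentals by asset")
ax.legend()
```

The resulting frame has one row per date, so dates with a missing asset show up as NaN gaps instead of causing a length mismatch.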

3. Using QGrid

In [9]:
import qgrid

# Display the dataframe in an interactive qgrid widget
qgrid_widget = qgrid.show_grid(data)
qgrid_widget

In [15]:
# Before running this cell, use the grid's filter on the second index
# column to show only the rows for T. get_changed_df() then returns
# the filtered dataframe.
T_data = qgrid_widget.get_changed_df()
In [16]:
# Before running this cell, change the grid's filter on the second index
# column to show only the rows for VZ. get_changed_df() then returns
# the filtered dataframe.
VZ_data = qgrid_widget.get_changed_df()
In [17]:
T_data.head()
Out[17]:
net_income operating_expenses operating_gains_losses operating_income
2002-01-02 00:00:00+00:00 Equity(6653 [T]) 2.072000e+09 2.200000e+09 -219000000.0 2.822000e+09
2002-01-03 00:00:00+00:00 Equity(6653 [T]) 2.072000e+09 2.200000e+09 -219000000.0 2.822000e+09
2002-01-04 00:00:00+00:00 Equity(6653 [T]) 2.072000e+09 2.200000e+09 -219000000.0 2.822000e+09
2002-01-07 00:00:00+00:00 Equity(6653 [T]) 2.072000e+09 2.200000e+09 -219000000.0 2.822000e+09
2002-01-08 00:00:00+00:00 Equity(6653 [T]) 2.072000e+09 2.200000e+09 -219000000.0 2.822000e+09
In [18]:
VZ_data.head()
Out[18]:
net_income operating_expenses operating_gains_losses operating_income
2002-01-02 00:00:00+00:00 Equity(21839 [VZ]) 1.883000e+09 3.402000e+09 -29000000.0 3.677000e+09
2002-01-03 00:00:00+00:00 Equity(21839 [VZ]) 1.883000e+09 3.402000e+09 -29000000.0 3.677000e+09
2002-01-04 00:00:00+00:00 Equity(21839 [VZ]) 1.883000e+09 3.402000e+09 -29000000.0 3.677000e+09
2002-01-07 00:00:00+00:00 Equity(21839 [VZ]) 1.883000e+09 3.402000e+09 -29000000.0 3.677000e+09
2002-01-08 00:00:00+00:00 Equity(21839 [VZ]) 1.883000e+09 3.402000e+09 -29000000.0 3.677000e+09
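
The two frames above come from filtering by hand in the widget, a step the notebook itself does not record. A rough non-interactive equivalent, sketched on toy data with strings in place of Equity objects (an assumption; with real pipeline output the comparison would presumably be against `symbols('T')`), is a boolean mask on the second index level:

```python
import numpy as np
import pandas as pd

# Toy frame: strings stand in for Equity objects (assumption for illustration).
dates = pd.date_range("2002-01-02", periods=3, freq="B", tz="UTC")
index = pd.MultiIndex.from_product([dates, ["T", "VZ"]])
toy = pd.DataFrame({"net_income": np.arange(6, dtype=float)}, index=index)

# Compare the asset level against each ticker to build a row mask.
asset_level = toy.index.get_level_values(1)
T_toy = toy[asset_level == "T"]
VZ_toy = toy[asset_level == "VZ"]

print(T_toy)
print(VZ_toy)
```

Like the widget filter, the mask keeps both index levels, so the results have the same shape as the frames shown above.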
In [19]:
# Now plot using the same code from above
columns = list(data.columns)
dates = data.index.levels[0]

plt.clf() # clears previous plot

# Iterate through columns and plot each one
for column in columns:
    plt.plot(dates, T_data[column], label='T '+str(column))
    plt.plot(dates, VZ_data[column], label='VZ '+str(column))
    
plt.legend()
plt.title('Historical T and VZ fundamentals (USD)')
plt.show()