Qgrid - An interactive grid for exploring pandas DataFrames¶

Qgrid is a scrollable grid widget that can be used to edit, sort, and filter DataFrames in Jupyter notebooks. It was developed for use in Quantopian's hosted research environment and also was released as an open source project on GitHub.

Usage & API¶

The primary method in the API is the show_grid method, which is used as follows to show a DataFrame called df:

qgrid.show_grid(df)

The show_grid method returns a QgridWidget instance, which will be rendered as a interactive grid when displayed as the output for a cell.

The QgridWidget instance it returns also has some methods on it that may be useful, such as the get_changed_df method, which allows you to get a copy of the DataFrame that qgrid is managing internally (with filters/sorts/edits applied).

See the API documentation hosted on readthedocs for more information about the QgridWidget class and the other methods available in the API.

The API documentation can also be accessed via the "?" operator in Jupyter notebooks. To use the "?" operator, type the name of the function followed by "?" to see the documentation for that function, like this:

qgrid.show_grid?
qgrid.set_defaults?
qgrid.set_grid_option?
qgrid.enable?
qgrid.disable?

Examples¶

The following examples show qgrid being used to display various DataFrames. Examples 1a and 1b both use Quantopian's pipeline API. Examples 2 and 3 both use dummy data to demonstrate various capabilities of qgrid.

Execute the following cell once to import all of the modules that will be needed in this notebook and do some initial configuration of qgrid.

from quantopian.pipeline import Pipeline
from quantopian.research import run_pipeline
from quantopian.pipeline.filters import StaticSids
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.factors import (
    DailyReturns,
    MarketCap,
    MaxDrawdown,
)
from quantopian.pipeline.data.psychsignal import stocktwits_free
from quantopian.pipeline.factors import SimpleMovingAverage

import numpy as np
import pandas as pd

import qgrid

# set the default max number of rows to 12 so the DataFrame we render with qgrid aren't too tall
qgrid.set_grid_option('maxVisibleRows', 12)

Example 1a - Render a DataFrame returned by Quantopian's pipeline API¶

In the first cell we'll create a Pipeline which includes returns, market cap, and max drawdown for some familiar stocks, and run it using run_pipeline.

In the 2nd and 3rd cells we'll render the DataFrame returned by run_pipeline without qgrid, and then with it, for comparison.

pipe = Pipeline(
    columns={
        'returns': DailyReturns(),
        'market_cap': MarketCap(),
        'max_drawdown': MaxDrawdown([USEquityPricing.close], window_length=7),
    },
    screen=StaticSids([symbols('AAPL'), symbols('MSFT'), symbols('AMZN'), symbols('TSLA')])
)

df = run_pipeline(pipe, '2015-01-01', '2015-04-01')

# convert market_cap to billions for readability
df['market_cap'] = df['market_cap'].map(lambda x: x/(1.0 * 10**9))

# render the DataFrame without using qgrid first
df.head(n=8)

# now render the same DataFrame using qgrid
qgrid.show_grid(df)

Example 1b - Now let's include some sentiment data from StockTwits ¶

In the first cell we'll update the pipeline from the previous example (1a) by adding a couple of columns from the StockTwits data set. In particular a "msg_volume" column will be added which indicates the number of StockTwits messages that were scanned on a particular day, and a "sentiment" column will be added which is calculated as a SimpleMovingAverage of the "bull_minus_bear" column from StockTwits.

The second cell simply runs the updated pipeline that was created in the first cell, but for a larger date range than the previous example, to show that qgrid has no trouble handling a larger DataFrame. Please note that executing this cell could take a minute.

In the 3rd and 4th cells we'll render the DataFrame returned by run_pipeline without qgrid, and then with it, for comparison. Since there are too many columns to fit comfortably in the grid area, we'll use the forceFitColumns option to enable a horizontal scroll bar. We'll also set the show_toolbar option to true, which makes a "full screen" button available.

# calculate 7 day average of bull_minus_bear scores from StockTwits
sentiment_score = SimpleMovingAverage(
    inputs=[stocktwits_free.bull_minus_bear],
    window_length=7,
)

# add the 'msg_volumne' column from StockTwits as well 
# as the sentiment_score factor we just created
pipe.add(stocktwits_free.total_scanned_messages.latest, 'msg_volume')
pipe.add(sentiment_score, 'sentiment')

sentiment_df = run_pipeline(pipe, '2009-01-01', '2011-01-01')

# reorder columns
col_order = [
    'msg_volume',
    'sentiment',
    'returns',
    'market_cap',
    'max_drawdown',
]
sentiment_df = sentiment_df[col_order]
sentiment_df['market_cap'] = sentiment_df['market_cap'].map(lambda x: x/(1.0 * 10**9))
sentiment_df.head()

qgrid.show_grid(sentiment_df, grid_options={'forceFitColumns': False}, show_toolbar=True)

Example 2 - Render a DataFrame with many different types of columns¶

This example constructs a DataFrame of dummy data and renders it using qgrid. The dummy data contains many of the types of columns that qgrid supports, so it's a good way to get an idea of what the filtering and editing controls look like for each type of column.

To test filtering, click the filter icon in any of the column headers of the grid. To test editing, double click the cell you'd like to edit.

randn = np.random.randn
df_types = pd.DataFrame({
    'A' : pd.Series(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
               '2013-01-05', '2013-01-06', '2013-01-07', '2013-01-08', '2013-01-09'],index=list(range(9)),dtype='datetime64[ns]'),
    'B' : pd.Series(randn(9),index=list(range(9)),dtype='float32'),
    'C' : pd.Categorical(["washington", "adams", "washington", "madison", "lincoln","jefferson", "hamilton", "roosevelt", "kennedy"]),
    'D' : ["foo", "bar", "buzz", "bippity","boppity", "foo", "foo", "bar", "zoo"] })
df_types['E'] = df_types['D'] != 'foo'
df_types

qgrid_widget = qgrid.show_grid(df_types, show_toolbar=True)
qgrid_widget

qgrid_widget.get_changed_df()

Example 3 - Render a DataFrame with 1 million rows¶

This example constructs a large DataFrame of dummy data and renders it using qgrid. The purpose of this example is to show that qgrid can handle large DataFrames.

df_scale = pd.DataFrame(np.random.randn(1000000, 4), columns=list('ABCD'))
# duplicate column B as a string column, to test scalability for text column filters
df_scale['B (as str)'] = df_scale['B'].map(lambda x: str(x))
q_scale = qgrid.show_grid(df_scale, show_toolbar=True, grid_options={'forceFitColumns': False, 'defaultColumnWidth': 200})
q_scale

q_scale.get_changed_df()