Pandas: groupby plotting and visualization in Python

In this data visualization recipe, we’ll learn how to visualize grouped data with the Pandas library as part of a data wrangling workflow.

Data acquisition

We’ll start by creating representative data. Copy the code below and paste it into your notebook:

# Python3
# Import Pandas
import pandas as pd

# Create DataFrame
budget = pd.DataFrame({"quarter": [1, 3, 2, 4, 1, 4, 2, 2],
                       "area": ['North', 'South', 'West', 'Midwest'] * 2,
                       "target": [6734, 7265, 1466, 5426, 6578, 9322, 2685, 1769]})
budget.head()

Here are the first five rows of our DataFrame:

   quarter     area  target
0        1    North    6734
1        3    South    7265
2        2     West    1466
3        4  Midwest    5426
4        1    North    6578

Plot groupby in Pandas

Let’s first group the data by area and sum the targets for each group:

sales_by_area = budget.groupby('area').agg(sales_target =('target','sum'))

Here’s the resulting new DataFrame:

sales_by_area

         sales_target
area
Midwest          7195
North           13312
South           16587
West             4151
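
The named aggregation used above can also be written by selecting the column first and summing it. The snippet below is a minimal sketch that rebuilds the same data so it runs on its own; the variable name alt is only illustrative.

```python
import pandas as pd

# Same sample data as in the Data acquisition step
budget = pd.DataFrame({
    "quarter": [1, 3, 2, 4, 1, 4, 2, 2],
    "area": ["North", "South", "West", "Midwest"] * 2,
    "target": [6734, 7265, 1466, 5426, 6578, 9322, 2685, 1769],
})

# Named aggregation: new column name = (source column, aggregation function)
sales_by_area = budget.groupby("area").agg(sales_target=("target", "sum"))

# Alternative spelling: select the column, sum it, then convert the Series to a DataFrame
alt = budget.groupby("area")["target"].sum().to_frame("sales_target")

print(sales_by_area)
print(alt)
```

Both forms should produce one row per area; the named aggregation is handy when you want to control the output column name or compute several aggregates at once.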

Groupby pie chart

We’ll use the DataFrame plot method and pass the relevant parameters. Note the use of the optional title, cmap (colormap), figsize and autopct parameters.

  • title assigns a title to the chart.
  • cmap sets the colormap used to color the wedges.
  • figsize determines the width and height of the plot, in inches.
  • autopct formats the label on each wedge as a percentage of the total; "%.1f%%" shows one decimal place followed by a percent sign.
# area is the index of sales_by_area, so only y needs to be specified
sales_by_area.plot(kind='pie', y='sales_target', title='Sales by Zone',
                   cmap='Dark2', autopct="%.1f%%", figsize=(10, 6), legend=False);

Here’s the resulting plot:
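
The percentages that autopct draws on the wedges can be checked numerically. The following is a small sketch, not part of the original recipe, that rebuilds the grouped data and computes each area's share of the total; the variable name share is an assumption made for illustration.

```python
import pandas as pd

# Same sample data as in the Data acquisition step
budget = pd.DataFrame({
    "quarter": [1, 3, 2, 4, 1, 4, 2, 2],
    "area": ["North", "South", "West", "Midwest"] * 2,
    "target": [6734, 7265, 1466, 5426, 6578, 9322, 2685, 1769],
})
sales_by_area = budget.groupby("area").agg(sales_target=("target", "sum"))

# Share of each area in the overall total, in percent
share = sales_by_area["sales_target"] / sales_by_area["sales_target"].sum() * 100

# Round to one decimal place to mirror the "%.1f%%" format string
print(share.round(1))
```

The printed values should match the labels on the pie chart, which is a quick way to confirm that the chart was built from the column you intended.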

Groupby barplot

Here’s a similar example, this time using a bar plot:

sales_by_area.plot(kind='bar', title = 'Sales by Zone', figsize = (10,6), cmap='Dark2', rot = 30);

Note the legend that is added to the chart by default. Also worth noting is the optional rot parameter, which rotates the tick labels by the given number of degrees (30 in our case).

Here’s the resulting chart:
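
The plot method returns a Matplotlib Axes object, so the chart can be adjusted after it is drawn. The sketch below is one possible variation: it turns the default legend off with legend=False, adds axis labels, and writes each value above its bar. The label texts are illustrative, and Axes.bar_label assumes Matplotlib 3.4 or later.

```python
import pandas as pd

# Same sample data as in the Data acquisition step
budget = pd.DataFrame({
    "quarter": [1, 3, 2, 4, 1, 4, 2, 2],
    "area": ["North", "South", "West", "Midwest"] * 2,
    "target": [6734, 7265, 1466, 5426, 6578, 9322, 2685, 1769],
})
sales_by_area = budget.groupby("area").agg(sales_target=("target", "sum"))

# Keep the returned Axes so the chart can be customized further
ax = sales_by_area.plot(kind="bar", title="Sales by Zone", figsize=(10, 6),
                        cmap="Dark2", rot=30, legend=False)

# Axis labels chosen for illustration
ax.set_xlabel("Area")
ax.set_ylabel("Sales target")

# Write each bar's value above it (requires Matplotlib 3.4 or later)
ax.bar_label(ax.containers[0])
```

With a single data column the legend adds little, which is why this variation drops it in favor of a y-axis label.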

Groupby histogram

We can also quickly plot a histogram in Pandas. Note the use of kind='hist' as a parameter to the plot method. The chart shows how the aggregated sales_target values are distributed, rather than one bar per area:

sales_by_area.plot(kind='hist', title = 'Sales by Zone', figsize = (10,6), cmap='Dark2', rot = 30);
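
With only four aggregated values, the histogram above has little to show. As a hedged variation, the sketch below plots the eight individual target values instead and sets the number of bins explicitly with the bins parameter; the title, axis label, and the choice of five bins are assumptions made for illustration.

```python
import pandas as pd

# Same sample data as in the Data acquisition step
budget = pd.DataFrame({
    "quarter": [1, 3, 2, 4, 1, 4, 2, 2],
    "area": ["North", "South", "West", "Midwest"] * 2,
    "target": [6734, 7265, 1466, 5426, 6578, 9322, 2685, 1769],
})

# Histogram of the eight individual targets, split into five equal-width bins
ax = budget["target"].plot(kind="hist", bins=5, title="Distribution of targets",
                           figsize=(10, 6))
ax.set_xlabel("Target")

# The height of each bar is the number of targets that fall into that bin
counts = [int(p.get_height()) for p in ax.patches]
print(counts)
```

The counts always add up to the number of rows plotted, which makes it easy to check that no values were dropped.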
