A violin plot is a graphical representation of the distribution of a numeric variable. It combines a density estimate with summary statistics such as the median and quartiles, giving a compact view of the central tendency and dispersion of the data.

In this post, we will explore how to use Seaborn to create a violin plot with two levels of hierarchy, meaning each observation in the dataset belongs to both a group and a subgroup.


First, you need to load the following libraries:

# libraries
import seaborn as sns
import matplotlib.pyplot as plt


The dataset we will use is the tips dataset. Each row is a restaurant bill: for each one we know the amount of the bill (total_bill) along with other information, such as the day of the week (day) and whether the customer was a smoker (smoker). We can easily download it with seaborn:

df = sns.load_dataset('tips')

Grouped violin plot

To create a grouped violin plot with seaborn, pass the name of the categorical variable that defines the subgroups to the hue argument. Here, each day on the x-axis is split into smokers and non-smokers.

# Grouped violinplot
sns.violinplot(x="day", y="total_bill", hue="smoker", data=df, palette="Pastel1")
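
When the hue variable has exactly two levels, as smoker does, the two subgroups can also share a single violin, one half each. The sketch below illustrates this with seaborn's split and inner arguments. To stay self-contained it uses a toy DataFrame with made-up values and the same column names as tips (an assumption for illustration, not the real data); with the dataset loaded above, you would pass df instead of toy.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Toy stand-in for the tips data: made-up values, same column names (assumption)
rng = np.random.default_rng(0)
n = 200
toy = pd.DataFrame({
    "day": rng.choice(["Thur", "Fri", "Sat", "Sun"], size=n),
    "smoker": rng.choice(["Yes", "No"], size=n),
    "total_bill": rng.gamma(shape=4.0, scale=5.0, size=n),
})

fig, ax = plt.subplots(figsize=(8, 4))

# split=True draws the two hue levels as the two halves of one violin per day;
# inner="quart" marks the quartiles of each half with dashed lines
sns.violinplot(
    x="day", y="total_bill", hue="smoker", data=toy,
    split=True, inner="quart", palette="Pastel1", ax=ax,
)
```

Outside a notebook, call plt.show() afterwards to display the figure. Splitting halves the number of violins, which can help when there are many groups on the x-axis.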

Inverting categorical variables

It can be useful to swap the categorical variable on the x-axis with the one passed to the hue argument. The resulting graph looks quite different, and may make it easier to understand the distribution of the underlying variables: here, the days are compared within each smoker group.

# Grouped violinplot with x and hue swapped
sns.violinplot(x="smoker", y="total_bill", hue="day", data=df, palette="Pastel1")
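
Whichever variable goes on the x-axis, each violin is drawn from the observations of a single group and subgroup combination, so it can help to check how many rows each combination holds before reading too much into a shape. Here is a minimal sketch of such a check with pandas. The helper subgroup_counts and the tiny toy DataFrame are hypothetical and only serve the illustration; with the real data you would call subgroup_counts(df, "smoker", "day").

```python
import pandas as pd


def subgroup_counts(data, group, subgroup):
    """Return a group x subgroup table of observation counts."""
    return pd.crosstab(data[group], data[subgroup])


# Tiny hand-written frame (made-up values) with the same column names as tips
toy = pd.DataFrame({
    "smoker": ["Yes", "Yes", "No", "No", "No", "Yes"],
    "day": ["Sat", "Sun", "Sat", "Sat", "Sun", "Sat"],
    "total_bill": [20.5, 15.0, 32.1, 18.4, 25.0, 11.9],
})

counts = subgroup_counts(toy, "smoker", "day")
print(counts)
```

A combination with only a handful of rows produces a violin whose shape mostly reflects the smoothing of the density estimate rather than the data.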

Going further

This post explains how to create a grouped violin plot with seaborn.

