Libraries
First, we need to load a few libraries:
- matplotlib: for displaying the chart
- seaborn: for creating the chart
- numpy: for some calculations
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
df = sns.load_dataset('iris')
Violin plot
In the following example, we start from a simple violinplot and add annotations to it.
To do so we:
- calculate the median sepal_length for each group and store the values in a variable named medians
- create a nobs list which stores the number of observations for each group
- finally, add labels to our figure.
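The first two steps can be sketched on a small hand-made table. This is only an illustration: the toy DataFrame and the stats variable are assumptions, not part of the original example. Computing both statistics in a single aggregation keeps them indexed by the same group keys, so the medians and the counts stay in the same order.

```python
import pandas as pd

# Toy data for illustration only (not the iris dataset)
toy = pd.DataFrame({
    "species": ["b", "a", "b", "a", "b"],
    "sepal_length": [6.0, 5.0, 6.4, 5.2, 7.0],
})

# One aggregation yields both statistics, indexed by the same group keys
stats = toy.groupby("species")["sepal_length"].agg(["median", "count"])

# Same shapes as in the post: an array of medians and a list of "n: ..." labels
medians = stats["median"].values
nobs = ["n: " + str(n) for n in stats["count"]]
```

Here the groups come out sorted by key (a, then b), and nobs follows that same order.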
To add labels, keep in mind that seaborn is built on top of matplotlib: sns.violinplot() returns the matplotlib axes it draws on (here we store it in a variable named ax). This lets us use the axes methods .get_xticklabels() and .text(), along with the various parameters of .text() (horizontalalignment, size, color, weight), to add text to our figure.
# calculate medians and number of observations
medians = df.groupby(['species'])['sepal_length'].median().values
# count observations with the same groupby so the order matches medians
nobs = df.groupby(['species'])['sepal_length'].count().values
nobs = ["n: " + str(x) for x in nobs]
sns.set_theme(style="darkgrid")
ax = sns.violinplot(x="species", y="sepal_length", data=df)
# Add text to the figure
pos = range(len(nobs))
for tick, label in zip(pos, ax.get_xticklabels()):
    ax.text(pos[tick], medians[tick] + 2, nobs[tick],
            horizontalalignment='center',
            size='small',
            color='black')
plt.show()
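As a possible variation, the statistics can be looked up by the text of each tick label instead of by position, which avoids relying on the groups and the x-axis being in the same order. The sketch below is an assumption-laden illustration rather than the post's method: it uses a toy DataFrame, a non-interactive backend, and places each label at the median itself.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy data for illustration only (not the iris dataset)
toy = pd.DataFrame({
    "species": ["b", "b", "b", "b", "a", "a", "a"],
    "sepal_length": [6.0, 6.4, 7.0, 6.6, 5.0, 5.2, 5.6],
})

# Median and number of observations per group, indexed by group name
stats = toy.groupby("species")["sepal_length"].agg(["median", "count"])

fig, ax = plt.subplots()
sns.violinplot(x="species", y="sepal_length", data=toy, ax=ax)
fig.canvas.draw()  # make sure the tick labels are populated

# Look up each group's statistics by the text of its tick label
for x_pos, label in enumerate(ax.get_xticklabels()):
    group = label.get_text()
    ax.text(x_pos, stats.loc[group, "median"],
            "n: " + str(stats.loc[group, "count"]),
            horizontalalignment="center", size="small", color="black")
```

The loop assumes the categorical ticks sit at positions 0, 1, 2, ... which is how seaborn lays out categorical axes.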
Going further
This post explains how to create and customize a violin plot with the seaborn library.
You might be interested in how to add individual data points to a violin plot and how to create a raincloud plot with the ptitprince library.