Control the bandwidth of a seaborn density plot


KDE (or density) plots display information in a similar way to histograms, but they use a Gaussian kernel to produce a smoothed line that follows the distribution of the observations.

Just as you can change the bin size of a histogram to group the data differently, you can change the way a kdeplot is built, in particular how smooth the line is. This post explains how to control the bandwidth parameter of a seaborn density plot.

Libraries

First, we need to import a few libraries:

import matplotlib.pyplot as plt
import seaborn as sns

Dataset

The dataset we will use is the iris dataset, which can be loaded directly with seaborn.

df = sns.load_dataset('iris')

Default density plot

Density plots differ from histograms in that they are smoothed versions of them.

To produce that smoothing, however, we need to define a bandwidth, a parameter that controls how smooth the density plot is. Varying the bandwidth gives different density plots, and can reveal different information too!

In seaborn, the bw_method argument controls it. When left unset, it defaults to the "scott" rule; the plot below sets bw_method=1 explicitly, which gives a fairly smooth curve:

sns.set_theme(style="darkgrid")
sns.kdeplot(df['sepal_width'], fill=True, color="olive", bw_method=1)
plt.show()
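For reference, here is a minimal sketch of what a bw_method value means numerically. It uses scipy.stats.gaussian_kde directly, on the assumption that seaborn interprets bw_method the same way (the seaborn documentation points to it). The values array is a synthetic stand-in for a 150-row column such as df['sepal_width'], not the real iris data.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic stand-in for a 150-row column (assumption, not the iris data)
rng = np.random.default_rng(0)
values = rng.normal(loc=3.0, scale=0.4, size=150)

# With no bw_method given, the "scott" rule is used:
# factor = n ** (-1 / 5) for one-dimensional data
kde_default = gaussian_kde(values)
scott_factor = kde_default.factor

# A scalar bw_method is used directly as the factor
kde_scalar = gaussian_kde(values, bw_method=1)

# In both cases the Gaussian kernel's standard deviation is
# factor * sample standard deviation
sample_std = values.std(ddof=1)
print(f"scott factor: {scott_factor:.3f}")
print(f"kernel std with 'scott': {scott_factor * sample_std:.3f}")
print(f"kernel std with bw_method=1: {kde_scalar.factor * sample_std:.3f}")
```

With 150 observations the "scott" factor is well below 1, so bw_method=1 smooths noticeably more than leaving the argument unset.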

Custom bandwidth

The density plots in this post are made from the same data. Only the bandwidth value changes, from 1 in the first graph to 0.08 in the one below.

This parameter can be of particular interest when a finer understanding of the distribution is needed. A smaller bandwidth can highlight bimodal distributions more easily and help reveal patterns that a wider bandwidth smooths away.
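To make the effect on bimodal data concrete, here is a small sketch that counts the peaks of a density estimate at two bandwidths. It relies on scipy.stats.gaussian_kde rather than seaborn so that the curve can be inspected numerically. The two-cluster sample, the count_modes helper, and the bandwidth values 1 and 0.2 are all illustrative assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic bimodal sample: two clusters centered at -2 and 2 (assumption)
rng = np.random.default_rng(42)
values = np.concatenate([
    rng.normal(-2.0, 0.7, size=500),
    rng.normal(2.0, 0.7, size=500),
])


def count_modes(data, bw_method, grid_size=512):
    """Count interior local maxima of the KDE evaluated on a regular grid."""
    kde = gaussian_kde(data, bw_method=bw_method)
    grid = np.linspace(data.min(), data.max(), grid_size)
    density = kde(grid)
    # A peak is a point where the curve stops rising and starts falling
    slope_sign = np.sign(np.diff(density))
    return int(np.sum((slope_sign[:-1] > 0) & (slope_sign[1:] < 0)))


modes_wide = count_modes(values, bw_method=1)
modes_narrow = count_modes(values, bw_method=0.2)
print("peaks with bw_method=1:", modes_wide)
print("peaks with bw_method=0.2:", modes_narrow)
```

The wide bandwidth merges the two clusters into a single bump, while the narrow one keeps them apart. A very small bandwidth can also produce extra peaks that are only noise, so the count should be read with care.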

Deprecation:

Note that older versions of seaborn (< 0.11.0) used the bw parameter. It has since been deprecated and replaced by bw_method and bw_adjust.

See scipy.stats.gaussian_kde in the SciPy documentation for further details on bw_method.

In seaborn versions before 0.11.0, you would use sns.kdeplot(df['sepal_width'], shade=True, bw=0.05, color='olive').

Both the shade and bw arguments are now deprecated: use fill instead of shade, and bw_method or bw_adjust instead of bw.

sns.set_theme(style="darkgrid")
sns.kdeplot(df['sepal_width'], fill=True, color='olive', bw_method=0.08)
plt.show()

Going further

This post explains how to control smoothing in a density plot with seaborn.

You might also be interested in displaying the distribution of multiple variables or in creating a mirrored density plot.

Contact & Edit


👋 This document is a work by Yan Holtz. You can contribute on GitHub, send me feedback on Twitter, or subscribe to the newsletter to know when new examples are published! 🔥
