A heatmap is a graphical representation of data where each value of a matrix is represented as a color. This page explains how to build a heatmap with Python, with an emphasis on the Seaborn library.

Heatmap with Seaborn

Seaborn is a Python library that makes it easy to build good-looking charts, and heatmaps are drawn with its heatmap() function. This section starts with a post describing the basic usage of the function, whatever the format of the input data. It then guides you through the different ways to customize the chart, such as controlling colors and normalizing the data.
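As a rough illustration of that basic usage, here is a minimal sketch. The data frame is random, made-up data and the column names are placeholders; only heatmap() itself comes from the text above, and the cmap, annot and fmt arguments are just one possible choice of styling.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up data (an assumption for this sketch): 10 rows x 5 variables
rng = np.random.default_rng(seed=0)
df = pd.DataFrame(rng.random((10, 5)), columns=["a", "b", "c", "d", "e"])

# Each value of the matrix becomes one colored cell
ax = sns.heatmap(df, cmap="viridis", annot=True, fmt=".2f")
ax.set_title("Basic seaborn heatmap")

# Call plt.show() to display the figure in an interactive session
```

heatmap() returns the matplotlib Axes it drew on, so the usual matplotlib methods (titles, labels, saving the figure) remain available for further tweaks.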

⚠️ Python heatmap and normalization

Consider the left heatmap below. The second column from the left (variable 1) has very high values compared to the others. As a result, the variation within the other variables is hidden.

Highlighting variable 1 may be the main message of your chart. But if you are also interested in how the other variables vary, you probably want to apply some normalization, as shown in the right heatmap.

If you want to know more about normalization, check data-to-viz.com. If you want some Python code to do it, it's here.
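To make the idea concrete, here is a small sketch of one possible normalization: standardizing each column (z-score), so every variable ends up with mean 0 and standard deviation 1. The data is made up, with "variable 1" artificially inflated to mimic the situation described above. Z-scoring is an assumption of this sketch, not necessarily the method used for the figures on this page.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up data (an assumption for this sketch): 10 rows x 5 variables
rng = np.random.default_rng(seed=1)
df = pd.DataFrame(
    rng.normal(size=(10, 5)),
    columns=[f"variable {i}" for i in range(5)],
)
# Inflate one column so that it dominates the color scale
df["variable 1"] = df["variable 1"] * 100 + 300

# Column-wise z-score: subtract each column's mean, divide by its std
df_norm = (df - df.mean()) / df.std()

# Raw data on the left, normalized data on the right
fig, (ax_raw, ax_norm) = plt.subplots(1, 2, figsize=(10, 4))
sns.heatmap(df, cmap="viridis", ax=ax_raw)
ax_raw.set_title("Raw values")
sns.heatmap(df_norm, cmap="viridis", ax=ax_norm)
ax_norm.set_title("Normalized by column")
fig.tight_layout()
```

Other choices are possible, for instance rescaling each column to the 0 to 1 range, or normalizing by row instead of by column. Which one fits depends on the comparison you want the reader to make.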

❄ Python, Heatmap and Clustering

It is very common to apply a clustering technique to a heatmap. The idea is to group items that show similar patterns across their numeric variables. 💡

Usually, it is recommended to display a dendrogram on top of the heatmap to explain how the clustering was performed. Last but not least, it can be useful to compare the resulting groups with an expected structure, shown as an additional color.
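The three ingredients above (clustering, dendrograms, and an expected structure shown as a color) can be sketched with seaborn's clustermap() function, which requires SciPy. The data, the two groups "A" and "B", and the chosen colors are all made up for this example, and the Ward linkage is only one possible clustering method.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data (an assumption for this sketch): two groups of 6 items,
# measured on 4 variables, with clearly different average values
rng = np.random.default_rng(seed=2)
group_a = rng.normal(loc=0.0, size=(6, 4))
group_b = rng.normal(loc=10.0, size=(6, 4))
df = pd.DataFrame(
    np.vstack([group_a, group_b]),
    index=[f"item {i}" for i in range(12)],
    columns=[f"variable {i}" for i in range(4)],
)

# Expected structure: the group each item is supposed to belong to
expected = pd.Series(["A"] * 6 + ["B"] * 6, index=df.index, name="expected group")
row_colors = expected.map({"A": "tab:blue", "B": "tab:orange"})

# Hierarchical clustering of rows and columns, with dendrograms
# and one extra color strip showing the expected group of each row
grid = sns.clustermap(
    df,
    method="ward",
    metric="euclidean",
    row_colors=row_colors,
    cmap="viridis",
    figsize=(6, 6),
)

# Row order after clustering (positions in the original data frame)
order = grid.dendrogram_row.reordered_ind
```

If the color strip next to the heatmap shows two unbroken blocks, the clustering recovered the expected groups. A strip with mixed colors would signal a mismatch between the clusters and the expected structure.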

A seaborn heatmap with clustering and a dendrogram applied



👋 This document is a work by Yan Holtz. You can contribute on GitHub, send me feedback on Twitter, or subscribe to the newsletter to know when new examples are published! 🔥