Libraries & Dataset

Let's start by importing a few libraries and creating a dataset:

# libraries
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
 
# create data
size = 100000
df = pd.DataFrame({
   'x': np.random.normal(size=size),
   'y': np.random.normal(size=size)
})
df.head()
          x         y
0  0.156635  0.497530
1 -0.485384 -1.329300
2 -1.116573  1.873535
3  0.841880  0.375499
4 -0.528407 -1.696453

2D histograms

2D histograms are useful when you need to analyse the relationship between two numerical variables with a very large number of data points. The plane is split into rectangular bins and each bin is coloured by the number of points it contains, which avoids over-plotted scatterplots.

The following example illustrates the importance of the bins argument, which sets how many bins to use along the X and the Y axis.

The parameters of the hist2d() function used in the example are:

  • x, y: input values
  • bins: the number of bins in each dimension
  • cmap: colormap
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(8, 8))

# Big bins: few, coarse cells
axs[0, 0].hist2d(df['x'], df['y'], bins=(50, 50), cmap=plt.cm.jet)
axs[0, 0].set_title('bins = (50, 50)')

# Small bins: many, fine cells
axs[0, 1].hist2d(df['x'], df['y'], bins=(600, 600), cmap=plt.cm.jet)
axs[0, 1].set_title('bins = (600, 600)')

# If you do not set the same values for X and Y, the bins won't be square!
# Here: many bins along X, few along Y
axs[1, 0].hist2d(df['x'], df['y'], bins=(600, 30), cmap=plt.cm.jet)
axs[1, 0].set_title('bins = (600, 30)')

# Here: few bins along X, many along Y
axs[1, 1].hist2d(df['x'], df['y'], bins=(30, 600), cmap=plt.cm.jet)
axs[1, 1].set_title('bins = (30, 600)')

plt.show()
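
To see what the bins argument does to the underlying data, it can help to look at the values hist2d() returns. The snippet below is a minimal sketch: the seeded generator, the sample size and the bins=(40, 20) setting are assumptions chosen for illustration, not part of the example above.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data (assumption: a seeded generator, so results are reproducible)
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = rng.normal(size=10_000)

fig, ax = plt.subplots()

# hist2d returns the bin counts, the bin edges on each axis, and the drawn image
counts, xedges, yedges, image = ax.hist2d(x, y, bins=(40, 20))

print(counts.shape)              # one row per X bin, one column per Y bin
print(len(xedges), len(yedges))  # each axis has one more edge than bins
print(int(counts.sum()))         # every point falls into exactly one bin
```

With more bins, the same number of points is spread over more cells, so each cell holds fewer points and the plot looks noisier.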

Colors

Once you have decided on the bin size, you can change the colour palette. Matplotlib provides many pre-defined colormaps (also known as cmap).

Here is how to use them in a 2D histogram:

fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(8, 8))

# The "_r" suffix selects the reversed version of a colormap

# Reversed reds
axs[0, 0].hist2d(df['x'], df['y'], bins=(50, 50), cmap=plt.cm.Reds_r)
axs[0, 0].set_title('cmap=plt.cm.Reds_r')

# Reversed blues
axs[0, 1].hist2d(df['x'], df['y'], bins=(50, 50), cmap=plt.cm.Blues_r)
axs[0, 1].set_title('cmap=plt.cm.Blues_r')

# Reversed greens
axs[1, 0].hist2d(df['x'], df['y'], bins=(50, 50), cmap=plt.cm.Greens_r)
axs[1, 0].set_title('cmap=plt.cm.Greens_r')

# Reversed greys
axs[1, 1].hist2d(df['x'], df['y'], bins=(50, 50), cmap=plt.cm.Greys_r)
axs[1, 1].set_title('cmap=plt.cm.Greys_r')

plt.show()
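
Colormaps can also be selected by name, and the "_r" suffix gives the reversed version of a colormap. The sketch below illustrates this with plt.get_cmap(); the data and the variable names are assumptions made for the example.

```python
import matplotlib.pyplot as plt
import numpy as np

# Look up a colormap and its reversed counterpart by name
reds = plt.get_cmap("Reds")
reds_r = plt.get_cmap("Reds_r")

# A colormap maps a value in [0, 1] to an RGBA colour.
# The low end of "Reds" should match the high end of "Reds_r".
low_of_reds = reds(0.0)
high_of_reds_r = reds_r(1.0)
same_colour = np.allclose(low_of_reds, high_of_reds_r, atol=1e-2)
print(same_colour)

# Illustrative data (assumption: seeded generator for reproducibility)
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = rng.normal(size=10_000)

# The cmap argument also accepts the colormap name as a string
fig, ax = plt.subplots()
ax.hist2d(x, y, bins=(50, 50), cmap="Reds_r")
ax.set_title("cmap='Reds_r'")
```

Passing the name as a string is equivalent to passing the plt.cm attribute and is often easier when looping over several colormaps.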

Colorbar

Finally, it might be useful to add a color bar on the side as a legend. You can add a color bar using the colorbar() function.

plt.hist2d(df['x'], df['y'], bins=(50, 50), cmap=plt.cm.Greys_r)
plt.colorbar()
plt.show()
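
When working with explicit figure and axes objects, as in the subplot examples, the color bar can be attached to a specific axes. This is a minimal sketch under assumed data; the label text is a hypothetical choice.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data (assumption: seeded generator for reproducibility)
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = rng.normal(size=10_000)

fig, ax = plt.subplots(figsize=(6, 5))

# The last element returned by hist2d is the drawn image,
# which the color bar uses to read the colour scale
counts, xedges, yedges, image = ax.hist2d(x, y, bins=(50, 50), cmap="Greys_r")

# Attach the color bar to this axes and label it
cbar = fig.colorbar(image, ax=ax, label="points per bin")
```

Calling plt.show() afterwards displays the figure. Passing ax=ax matters most in a grid of subplots, where it tells Matplotlib which panel the color bar belongs to.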

Contact & Edit


👋 This document is a work by Yan Holtz. You can contribute on github, send me a feedback on twitter or subscribe to the newsletter to know when new examples are published! 🔥
