Libraries & Dataset
Let's start by importing a few libraries and creating a dataset:
# libraries
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# create data
size = 100000
df = pd.DataFrame({
'x': np.random.normal(size=size),
'y': np.random.normal(size=size)
})
df.head()

| | x | y |
|---|---|---|
| 0 | 0.156635 | 0.497530 |
| 1 | -0.485384 | -1.329300 |
| 2 | -1.116573 | 1.873535 |
| 3 | 0.841880 | 0.375499 |
| 4 | -0.528407 | -1.696453 |
2D histograms
2D histograms are useful when you need to analyse the relationship between two numerical variables that have a huge number of values. They split the plane into rectangular bins and colour each bin by the number of points it contains, which avoids over-plotted scatterplots.
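To make the binning idea concrete, here is a minimal sketch that computes the counts without drawing anything. It uses NumPy's `histogram2d`, which performs the same kind of binning that `hist2d()` displays. The names `rng`, `counts`, `xedges` and `yedges`, and the bin numbers, are illustrative choices and not part of the original example.

```python
import numpy as np

# Illustrative data: two independent normal variables (seeded for reproducibility)
rng = np.random.default_rng(42)
x = rng.normal(size=100_000)
y = rng.normal(size=100_000)

# Count the points falling in each cell of a 30 x 60 grid
counts, xedges, yedges = np.histogram2d(x, y, bins=(30, 60))

print(counts.shape)              # one row per x bin, one column per y bin
print(len(xedges), len(yedges))  # each axis has one more edge than it has bins
print(int(counts.sum()))         # every point lands in exactly one cell
```

Each entry of `counts` is what a 2D histogram encodes as a colour.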
The following example illustrates the importance of the bins argument. You can explicitly set how many bins you want for the X and the Y axis.
The parameters of the hist2d() function used in the example are:
- x, y: input values
- bins: the number of bins in each dimension
- cmap: colormap
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(8, 8))
# Big bins
axs[0, 0].hist2d(df['x'], df['y'], bins=(50, 50), cmap=plt.cm.jet)
axs[0, 0].set_title('bins = (50, 50)')
# Small bins
axs[0, 1].hist2d(df['x'], df['y'], bins=(600, 600), cmap=plt.cm.jet)
axs[0, 1].set_title('bins = (600, 600)')
# If you do not set the same values for X and Y, the bins won't be square!
axs[1, 0].hist2d(df['x'], df['y'], bins=(600, 30), cmap=plt.cm.jet)
axs[1, 0].set_title('bins = (600, 30)')
# Same idea with the two values swapped: few bins on X, many on Y
axs[1, 1].hist2d(df['x'], df['y'], bins=(30, 600), cmap=plt.cm.jet)
axs[1, 1].set_title('bins = (30, 600)')
plt.show()

Colors
Once you have decided on the bin size, you can change the colour palette. Matplotlib provides a whole bunch of pre-defined colormaps (also known as cmaps).
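As a quick illustration of what a colormap is, the sketch below looks up two built-in colormaps by name and evaluates them at both ends of their range. The variable names are illustrative. The `_r` suffix selects the reversed version of a palette.

```python
import matplotlib.pyplot as plt
import numpy as np

# Look up built-in colormaps by name; "_r" means "reversed"
reds = plt.get_cmap('Reds')
reds_r = plt.get_cmap('Reds_r')

# A colormap maps a number between 0 and 1 to an RGBA colour
light_end = reds(0.0)  # colour used for the lowest values
dark_end = reds(1.0)   # colour used for the highest values
print(light_end, dark_end)

# The reversed palette starts where the original one ends
print(np.allclose(dark_end, reds_r(0.0)))
```

In a 2D histogram, low-count bins take the colour at the start of the palette and high-count bins the colour at the end, so choosing the `_r` variant inverts which bins look dark.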
Here is how to use them in a 2D histogram:
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(8, 8))
# Reversed red palette
axs[0, 0].hist2d(df['x'], df['y'], bins=(50, 50), cmap=plt.cm.Reds_r)
axs[0, 0].set_title('cmap=plt.cm.Reds_r')
# Reversed blue palette
axs[0, 1].hist2d(df['x'], df['y'], bins=(50, 50), cmap=plt.cm.Blues_r)
axs[0, 1].set_title('cmap=plt.cm.Blues_r')
# Reversed green palette
axs[1, 0].hist2d(df['x'], df['y'], bins=(50, 50), cmap=plt.cm.Greens_r)
axs[1, 0].set_title('cmap=plt.cm.Greens_r')
# Reversed grey palette
axs[1, 1].hist2d(df['x'], df['y'], bins=(50, 50), cmap=plt.cm.Greys_r)
axs[1, 1].set_title('cmap=plt.cm.Greys_r')
plt.show()

Colorbar
Finally, it might be useful to add a color bar on the side as a legend showing which colour corresponds to which bin count. You can add a color bar using the colorbar() function.
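If you work with explicit Axes objects, as in the subplot grids of the previous sections, one possible approach is to keep the image returned by `hist2d()` and pass it to `fig.colorbar()`. The sketch below regenerates its own data so that it is self-contained. The names `rng`, `counts`, `image` and `cbar`, and the label text, are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data, generated the same way as the dataset in this post
rng = np.random.default_rng(0)
size = 100_000
x = rng.normal(size=size)
y = rng.normal(size=size)

fig, ax = plt.subplots(figsize=(6, 5))

# hist2d returns the counts, the bin edges and the drawn image
counts, xedges, yedges, image = ax.hist2d(x, y, bins=(50, 50), cmap=plt.cm.Greys_r)

# Attach a color bar for that image to the same Axes
cbar = fig.colorbar(image, ax=ax)
cbar.set_label('points per bin')
plt.show()
```

With the pyplot interface, the same result takes a single plt.colorbar() call: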
plt.hist2d(df['x'], df['y'], bins=(50, 50), cmap=plt.cm.Greys_r)
plt.colorbar()
plt.show()

Going further
You might be interested in:
- how to create a contour plot, which is a smoothed version of the 2d histogram
- how to combine a 2d density/histogram plot with marginal plots






