A hexbin plot is useful for representing the relationship between two numerical variables when you have a lot of data points. Instead of drawing points that overlap, the plotting window is split into hexagonal bins (hexbins), and the color of each hexbin denotes the number of points in it. This can be easily done using the hexbin() function of matplotlib. Note that you can change the size of the bins using the gridsize argument. The parameters of the hexbin() function used in the example are:
x, y: the data positions
gridsize: the number of hexagons in the x-direction and the y-direction
# libraries
import matplotlib.pyplot as plt
import numpy as np
# create data
x = np.random.normal(size=50000)
y = (x * 3 + np.random.normal(size=50000)) * 5
# Make the plot
plt.hexbin(x, y, gridsize=(15,15))
plt.show()
# We can control the size of the bins:
plt.hexbin(x, y, gridsize=(150,150))
plt.show()
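To see what the bin size changes beyond the picture, one can look at the counts behind the colors. The sketch below assumes that hexbin() returns a collection whose get_array() method holds one count per hexagon, which is how current matplotlib releases behave. The seeded generator and the variable names (coarse, fine, n_points) are illustrative and not part of the original example.

```python
# libraries
import matplotlib.pyplot as plt
import numpy as np

# create data (seeded here so the run is repeatable)
rng = np.random.default_rng(0)
n_points = 50000
x = rng.normal(size=n_points)
y = (x * 3 + rng.normal(size=n_points)) * 5

# draw both grids and keep the collections returned by hexbin()
coarse = plt.hexbin(x, y, gridsize=(15,15))
fine = plt.hexbin(x, y, gridsize=(150,150))

# get_array() gives the number of points in each hexagon
coarse_counts = coarse.get_array()
fine_counts = fine.get_array()
print("hexagons:", len(coarse_counts), "vs", len(fine_counts))
print("largest count:", coarse_counts.max(), "vs", fine_counts.max())
```

A larger gridsize gives more, smaller hexagons, so each one holds fewer points and the color scale covers a narrower range of counts.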
It is possible to change the color palette applied to the plot with the cmap argument. The matplotlib colormap reference lists the available color palettes and helps you pick the right one.
# libraries
import matplotlib.pyplot as plt
import numpy as np
# create data
x = np.random.normal(size=50000)
y = (x * 3 + np.random.normal(size=50000)) * 5
# Control the color
plt.hexbin(x, y, gridsize=(25,25), cmap=plt.cm.Greens)
plt.show()
# Other color
plt.hexbin(x, y, gridsize=(25,25), cmap=plt.cm.BuGn_r)
plt.show()
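The cmap argument also accepts the name of a colormap as a string, and adding the "_r" suffix to a name selects the reversed version of that palette. The following is a minimal sketch of that usage; the variable names greens and greens_r are illustrative.

```python
# libraries
import matplotlib.pyplot as plt
import numpy as np

# create data
x = np.random.normal(size=50000)
y = (x * 3 + np.random.normal(size=50000)) * 5

# colormap given by name: light to dark green as the count grows
greens = plt.hexbin(x, y, gridsize=(25,25), cmap="Greens")

# same palette reversed: dark to light green as the count grows
greens_r = plt.hexbin(x, y, gridsize=(25,25), cmap="Greens_r")

# check which colormap each plot ended up with
print(greens.get_cmap().name, greens_r.get_cmap().name)
```

Call plt.show() after each hexbin() call, as in the examples above, to display the two versions separately.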
Note that you can easily add a color bar beside the plot using the colorbar() function.
# Add a colorbar if necessary
plt.hexbin(x, y, gridsize=(25,25), cmap=plt.cm.Purples_r)
plt.colorbar()
plt.show()
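A color bar is easier to read when it says what the colors measure. The sketch below is one possible way to label it, using matplotlib's figure and axes objects instead of the plt shortcuts; the label text and the variable names (fig, ax, hb, cb) are illustrative choices.

```python
# libraries
import matplotlib.pyplot as plt
import numpy as np

# create data
x = np.random.normal(size=50000)
y = (x * 3 + np.random.normal(size=50000)) * 5

# draw the hexbin plot on an explicit axes and keep the returned collection
fig, ax = plt.subplots()
hb = ax.hexbin(x, y, gridsize=(25,25), cmap="Purples_r")

# attach the color bar to that collection and label it
cb = fig.colorbar(hb, ax=ax)
cb.set_label("Number of points per hexbin")
```

Call plt.show() afterwards to display the figure. Passing the collection to colorbar() explicitly makes it clear which plot the color bar describes when a figure holds more than one.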