Once you understand how to make a heatmap with seaborn and how to apply basic customization, you will probably want to control the color palette. This is a crucial step, since the message conveyed by your heatmap can change depending on the choice you make. Three options are possible:
Sequential palettes translate the value of a variable to the intensity of one color: from light to dark. You can use this kind of palette when you have, for example, a value going from 0 to 1. Let’s create this kind of data:
# libraries
import seaborn as sns
import pandas as pd
import numpy as np

# Create a (fake) dataset
df = pd.DataFrame(np.random.random((10, 10)), columns=["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"])
Several sequential palettes are available. Here are 4 examples applied to the df data frame. Find other possibilities here.
import matplotlib.pyplot as plt

# Draw each palette on its own figure so the heatmaps do not overlap
for cmap in ["YlGnBu", "Blues", "BuPu", "Greens"]:
    sns.heatmap(df, cmap=cmap)
    plt.show()
Note that you can control which values map to the lightest and darkest colors. This is possible using the vmin and vmax arguments. Check the 2 examples below. In the first, vmax is set to 0.5, so every cell with a value over 0.5 is drawn in dark purple. In the second, every cell under 0.5 is white and every cell over 0.7 is dark purple.
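# everything above 0.5 gets the last color of the palette
sns.heatmap(df, vmin=0, vmax=0.5)

# only the 0.5 to 0.7 range is spread over the palette
sns.heatmap(df, vmin=0.5, vmax=0.7)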
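To see why values beyond vmax all share the same color, here is a minimal sketch of the mapping from value to palette position. It assumes the heatmap relies on matplotlib's default linear normalization (`Normalize`), and the sample values are hypothetical, chosen only for illustration.

```python
import numpy as np
from matplotlib.colors import Normalize

# Hypothetical sample values, chosen only for illustration
values = np.array([0.0, 0.25, 0.5, 0.9])

# With vmin=0 and vmax=0.5, the range [0, 0.5] is stretched over the whole palette
norm = Normalize(vmin=0, vmax=0.5)

# Position of each value along the palette: 0 = first color, 1 = last color.
# Anything past vmax is clipped to 1, so it shares the last color.
positions = np.clip(np.asarray(norm(values)), 0, 1)
print(positions)
```

Here 0.5 and 0.9 end up at the same position, which is why they cannot be told apart on the heatmap once vmax is set to 0.5.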
Diverging palettes use 2 contrasting colors. Find palette examples here. For example, let’s create a dataset where values are spread around 0, with both negative and positive values. We probably want one color for the negative values and another one for the positive values.
# libraries
import seaborn as sns
import pandas as pd
import numpy as np

# create dataset
df = np.random.randn(30, 30)

# create heatmap
sns.heatmap(df, cmap="PiYG")
Here the color change happens around 0. But you can control this value with the center argument: let’s try with 1 and the default color palette. Everything over 1 will be red and everything under 1 will be blue, so almost everything turns blue since the data are centered on 0.
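The recentering described above can be sketched as follows, using the `center` argument of `sns.heatmap`. The seeded random generator and the `share_below` variable are assumptions added for illustration, and the exact colors of the default palette may vary with your seaborn version.

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Fake data centered on 0 (the seed is an assumption, used for reproducibility)
rng = np.random.default_rng(0)
data = rng.standard_normal((30, 30))

# center=1 places the midpoint of the diverging palette at 1 instead of 0
fig, ax = plt.subplots()
sns.heatmap(data, center=1, ax=ax)

# Share of cells that fall under the new center
share_below = (data < 1).mean()
print(f"{share_below:.0%} of cells are below 1")
```

Call plt.show() to display the figure. Since most cells are below 1, most of the heatmap ends up on the same side of the palette.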
The last possibility is to transform your continuous data into categorical data. When making such bins, several possibilities exist: you can put the same number of observations in each bin, or cut the data in regular steps. Here is an example using the qcut function of pandas.
# libraries
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# create data
df = pd.DataFrame(np.random.randn(6, 6))

# make it discrete: split each column into 3 quantile bins coded 0, 1, 2
df_q = pd.DataFrame()
for col in df:
    df_q[col] = pd.qcut(df[col], 3, labels=False)

# plot it
sns.heatmap(df_q)
plt.show()
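The other binning option mentioned above, cutting the data in regular steps, could look like the sketch below. It uses the cut function of pandas on the whole table at once (an assumption for illustration, whereas the qcut example works column by column), with a seeded generator so the result is reproducible.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Fake data (the seed is an assumption, used for reproducibility)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((6, 6)))

# Cut the full range of values into 3 equal-width intervals.
# labels=False returns the integer bin index (0, 1 or 2) of each value.
flat_bins = pd.cut(df.to_numpy().ravel(), bins=3, labels=False)
df_cut = pd.DataFrame(flat_bins.reshape(df.shape), index=df.index, columns=df.columns)

# Plot the binned values
fig, ax = plt.subplots()
sns.heatmap(df_cut, cmap="Blues", ax=ax)
```

With equal-width bins, the number of cells per bin depends on how the values are distributed, so the 3 colors are usually not equally represented, unlike with qcut.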