#92 Control color in seaborn heatmaps

Once you understand how to make a heatmap with seaborn and how to apply basic customization, you will probably want to control the color palette. This is a crucial step, since the message conveyed by your heatmap can change depending on the choice you make. Three options are possible:


  • Sequential palettes translate the value of a variable to the intensity of one color, from light to dark. You can use this kind of palette when you have, for example, a value going from 0 to 1. Let’s create this kind of data:

    # library
    import seaborn as sns
    import pandas as pd
    import numpy as np
    # Create a dataset (fake)
    df = pd.DataFrame(np.random.random((10,10)), columns=["a","b","c","d","e","f","g","h","i","j"])

    Several sequential palettes are available. Here are 4 examples applied to the df data frame; each call draws one heatmap. Other named matplotlib colormaps can be passed to cmap in the same way.

    sns.heatmap(df, cmap="YlGnBu")
    sns.heatmap(df, cmap="Blues")
    sns.heatmap(df, cmap="BuPu")
    sns.heatmap(df, cmap="Greens")

    Note that you can control the values mapped to the lightest and darkest colors. This is possible using the vmin and vmax arguments. Check the 2 examples below. In the first, vmax is set to 0.5, so every cell with a value over 0.5 gets the color at the top end of the palette (dark purple with a palette such as BuPu). In the second, every cell below 0.5 gets the color at the bottom end (white with BuPu), and every cell above 0.7 gets the color at the top end.

    sns.heatmap(df, vmin=0, vmax=0.5)
    sns.heatmap(df, vmin=0.5, vmax=0.7)
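
To compare the two settings side by side, here is a minimal sketch. It assumes the BuPu palette (so the scale runs from white to dark purple), a seeded random generator, and a matplotlib figure with two panels; none of these choices come from the calls above.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Fake data between 0 and 1 (the seed is an arbitrary choice for reproducibility)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10, 10)), columns=list("abcdefghij"))

# Two panels sharing the same data but different color limits
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(12, 5))

# Left: the color scale stops at 0.5, so every cell above 0.5 gets the darkest color
sns.heatmap(df, vmin=0, vmax=0.5, cmap="BuPu", ax=ax_left)

# Right: cells below 0.5 get the lightest color, cells above 0.7 the darkest
sns.heatmap(df, vmin=0.5, vmax=0.7, cmap="BuPu", ax=ax_right)

# Share of cells that are saturated at the dark end in the left panel
share_saturated = (df.to_numpy() > 0.5).mean()
print(f"Cells above vmax=0.5: {share_saturated:.0%}")
```

Narrowing the range between vmin and vmax spends the whole palette on a smaller slice of the data, at the cost of flattening everything outside that slice.
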
  • Diverging palettes use 2 contrasting colors. Find palette examples here. For example, let’s create a dataset whose values are centered on 0 and take both negative and positive values. We probably want one color for the negative values and another one for the positive values.

    # libraries
    import seaborn as sns
    import pandas as pd
    import numpy as np
    # create dataset
    df = np.random.randn(30, 30)
    # create heatmap
    sns.heatmap(df, cmap="PiYG")

    Here the color change is made at 0. But you can control this value with the center argument. Let’s try with 1 and the default color palette: everything over 1 will be red, everything under 1 will be blue. As a result, almost everything gets blue, since the data are centered on 0.

    sns.heatmap(df, center=1)
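
The effect of center is easier to see with two panels. The sketch below is only an illustration: it assumes the RdBu_r palette (blue for low values, red for high values) rather than the default, plus a seeded generator and a two-panel matplotlib figure.

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Standard normal data, centered on 0 (the seed is an arbitrary choice)
rng = np.random.default_rng(0)
data = rng.standard_normal((30, 30))

fig, (ax_zero, ax_one) = plt.subplots(1, 2, figsize=(12, 5))

# center=0: the middle color of the palette sits at 0, so blue and red are balanced
sns.heatmap(data, cmap="RdBu_r", center=0, ax=ax_zero)

# center=1: the middle color sits at 1, so most cells fall on the blue side
sns.heatmap(data, cmap="RdBu_r", center=1, ax=ax_one)

# Share of cells below the new center
share_below_one = (data < 1).mean()
print(f"Cells below center=1: {share_below_one:.0%}")
```
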
  • The last possibility is to transform your continuous data into categorical data. When making such bins, several possibilities exist: you can put the same number of observations in each bin, or cut the data in regular steps. Here is an example using the qcut function of pandas.

    # library
    import seaborn as sns
    import pandas as pd
    import numpy as np
    # create data
    df = pd.DataFrame(np.random.randn(6, 6))
    # make it discrete: split each column into 3 bins of equal size, coded 0, 1, 2
    df_q = pd.DataFrame()
    for col in df:
        df_q[col] = pd.to_numeric( pd.qcut(df[col], 3, labels=list(range(3))) )
    # plot it
    sns.heatmap(df_q)
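
The other binning option mentioned above, cutting the data in regular steps, can be sketched with pd.cut. This is an illustrative sketch, not part of the original example: the shared bin edges, the 3-color Blues palette and the seed are all assumptions. The qcut version is repeated alongside it for comparison.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Fake continuous data (the seed is an arbitrary choice)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((6, 6)))

# Option 1: same number of observations per bin, computed column by column
df_q = df.apply(lambda col: pd.qcut(col, 3, labels=False))

# Option 2: regular steps, using 3 equal-width bins shared by the whole table
values = df.to_numpy()
edges = np.linspace(values.min(), values.max(), 4)
df_c = df.apply(
    lambda col: pd.cut(col, bins=edges, labels=False, include_lowest=True)
)

# One color per category: a list of 3 colors is used as a discrete palette
palette = sns.color_palette("Blues", 3)
fig, (ax_q, ax_c) = plt.subplots(1, 2, figsize=(12, 5))
sns.heatmap(df_q, cmap=palette, ax=ax_q)
sns.heatmap(df_c, cmap=palette, ax=ax_c)
```

With qcut every column holds each category equally often, whereas with cut the categories follow the actual spread of the values, so a bin can end up nearly empty.
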

