Combine a choropleth map and a barplot in Python

This post demonstrates how to integrate a choropleth map with a barplot. You'll learn to create each chart individually and then seamlessly combine them.

It offers clear explanations with reproducible, step-by-step code examples.

About

This plot is a choropleth map combined with a barplot that serves as its legend.

It was originally designed in R by Vinicius Oike Reginatto. This post is a reproduction in Python by Joseph Barbier.

As a teaser, here is the plot we're going to build:

Libraries

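This chart relies on several libraries: geopandas to read and draw the map, pandas for the data wrangling, matplotlib and seaborn for plotting, pypalettes for the colormap, highlight_text for annotations, and pyfonts for custom fonts.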

import geopandas as gpd
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
from pypalettes import load_cmap
from highlight_text import fig_text, ax_text
from pyfonts import load_font

Dataset

We need two datasets:

  • one for the map, with the polygons and the measure of interest (here the HDI, the Human Development Index)
  • one for the bar chart, which we derive from the first with a few calculations

# Load the GeoJSON from the gallery's repository
# (or point `path` to a local copy of the file)
path = "https://raw.githubusercontent.com/holtzy/The-Python-Graph-Gallery/master/static/data/saopaulo.geojson"
df = gpd.read_file(path)
df.head()
        code_udh    HDI      pop  geometry
0  1355030801001  0.866  20388.0  POLYGON ((-46.53789 -23.56245, -46.53799 -23.5...
1  1355030801002  0.870  21937.0  POLYGON ((-46.54193 -23.54086, -46.54212 -23.5...
2  1355030801003  0.790  10536.0  POLYGON ((-46.51514 -23.55757, -46.51534 -23.5...
3  1355030801004  0.816  27925.0  POLYGON ((-46.53260 -23.57370, -46.53276 -23.5...
4  1355030801005  0.820   6817.0  POLYGON ((-46.50676 -23.57095, -46.50697 -23.5...

Now we create the dataset for the barplot:

atlas = df[["HDI", "pop"]].copy()

bins = [0.0, 0.65, 0.699, 0.749, 0.799, 0.849, 0.899, 0.949, 1.0]
labels = [
    "0.650 or less",
    "0.650 to 0.699",
    "0.700 to 0.749",
    "0.750 to 0.799",
    "0.800 to 0.849",
    "0.850 to 0.899",
    "0.900 to 0.949",
    "0.950 or more",
]

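# Assign each row to an HDI class (pd.cut intervals are closed on the right by default)
atlas["group_hdi"] = pd.cut(atlas["HDI"], bins=bins, include_lowest=True, labels=labels)

# Total population per class, then its share of the overall population (in %)
pop_hdi = atlas.groupby("group_hdi", observed=True)["pop"].sum().reset_index()
pop_hdi["share"] = (pop_hdi["pop"] / pop_hdi["pop"].sum()) * 100

# Half of each share (the midpoint of a bar of that height) and the text of its label
pop_hdi["y_text"] = pop_hdi["share"] / 2
pop_hdi["label"] = pop_hdi["share"].round(1).astype(str) + "%"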

pop_hdi
        group_hdi        pop      share     y_text  label
0   0.650 or less   415953.0   3.710661   1.855331   3.7%
1  0.650 to 0.699  1620623.0  14.457362   7.228681  14.5%
2  0.700 to 0.749  2106520.0  18.791984   9.395992  18.8%
3  0.750 to 0.799  2460948.0  21.953789  10.976895  22.0%
4  0.800 to 0.849  1576054.0  14.059768   7.029884  14.1%
5  0.850 to 0.899  1417249.0  12.643090   6.321545  12.6%
6  0.900 to 0.949  1409668.0  12.575460   6.287730  12.6%
7   0.950 or more   202658.0   1.807885   0.903943   1.8%
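
To see how pd.cut() treats values that fall exactly on a bin edge, here is a minimal sketch. The HDI values are made up for illustration (they do not come from the São Paulo dataset) and the list of bins is shortened. By default the intervals are closed on the right, so 0.650 lands in the first class while 0.651 lands in the second.

```python
import pandas as pd

# Hypothetical HDI values chosen to sit on or next to the bin edges
hdi = pd.Series([0.600, 0.650, 0.651, 0.699, 0.700, 0.950])

# Shortened version of the bins used in this post
bins = [0.0, 0.65, 0.699, 0.749, 1.0]

# Without labels, pd.cut returns the interval each value falls into
intervals = pd.cut(hdi, bins=bins, include_lowest=True)

# With labels=False, it returns the position of the bin instead
codes = pd.cut(hdi, bins=bins, include_lowest=True, labels=False)

for value, interval, code in zip(hdi, intervals, codes):
    print(f"{value:.3f} -> bin {code}: {interval}")

print(codes.tolist())  # [0, 0, 1, 1, 2, 3]
```

Printing the intervals is a quick way to check that the text labels passed to pd.cut() describe the classes you expect.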

Choropleth map

Creating the choropleth map is the easiest part:

  • load the YlGnBu colormap as a continuous colormap
  • create a matplotlib figure with plt.subplots()
  • hide the axes
  • call the plot() method of the df GeoDataFrame, passing the column that drives the colors (here, HDI)

palette_name = "YlGnBu"
cmap = load_cmap(palette_name, cmap_type="continuous")

fig, ax = plt.subplots(figsize=(10, 10), dpi=300)
ax.axis("off")
df.plot(ax=ax, column="HDI", cmap=cmap, edgecolor="lightgrey", linewidth=0.2, alpha=0.9)
plt.show()
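
A continuous colormap turns each value into a color by first rescaling it to the 0 to 1 range. The sketch below illustrates that idea with matplotlib alone. It assumes matplotlib 3.5 or later (for the matplotlib.colormaps registry), uses matplotlib's built-in YlGnBu rather than the pypalettes version, and works on made-up HDI values. As far as I know, geopandas applies the same kind of linear scaling between the column's minimum and maximum when no vmin, vmax, or scheme is passed, but treat that as an assumption to verify against its documentation.

```python
import matplotlib
from matplotlib.colors import Normalize, to_hex

# Built-in matplotlib colormap (assumption: stands in for the pypalettes one)
cmap = matplotlib.colormaps["YlGnBu"]

# Hypothetical HDI values, not taken from the dataset
hdi_values = [0.62, 0.75, 0.87, 0.96]

# Linearly rescale the values so that min -> 0.0 and max -> 1.0
norm = Normalize(vmin=min(hdi_values), vmax=max(hdi_values))

# Look up the color of each rescaled value and convert it to a hex string
colors = [to_hex(cmap(norm(v))) for v in hdi_values]

for value, color in zip(hdi_values, colors):
    print(f"HDI {value:.2f} -> {color}")
```

The lowest value gets the lightest color of the palette and the highest value the darkest one. Colors obtained this way can be reused elsewhere in a figure so that other elements match the map.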