About
This plot is a multiple line chart, generally used to show the evolution of a variable over time. In our case, it displays the evolution of unemployment rates in different regions in the world. Each line represents a region, and the y-axis represents the unemployment rate.
It was originally designed by Joseph Barbier. Thanks to him for sharing his work!
As a teaser, here is the plot we're going to build:
Libraries
To create this chart, we will need the following libraries:
- matplotlib: to customize the appearance of the chart
- pandas: to handle the data
- highlight_text: to add the annotations
# data manipulation
import pandas as pd
# create the charts
import matplotlib.pyplot as plt
# annotations
from highlight_text import ax_text, fig_text
# custom fonts
from matplotlib import font_manager
from matplotlib.font_manager import FontProperties
# arrows
from matplotlib.patches import FancyArrowPatch
Dataset
The data can be accessed using the URL below.
The chart relies on a single DataFrame, df, which is loaded and cleaned in the code that follows.
path = 'https://raw.githubusercontent.com/holtzy/The-Python-Graph-Gallery/master/static/data/economic_data.csv'
# open and clean dataset
df = pd.read_csv(path)
# convert to a date format
df['date'] = pd.to_datetime(df['date'])
# remove percentage sign and convert to float
col_to_update = ['unemployment rate', 'cpi yoy', 'core cpi', 'gdp yoy', 'interest rates']
for col in col_to_update:
    df[col] = df[col].str.replace('%', '').astype(float)
# display first rows
df.head()
|   | country | date | manufacturing pmi | services pmi | consumer confidence | interest rates | cpi yoy | core cpi | unemployment rate | gdp yoy | ticker | open | high | low | close |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | australia | 2020-01-01 | 49.6 | 50.6 | 93.4 | 0.75 | 2.2 | 1.7 | 5.2 | 1.2 | audusd | 0.7021 | 0.7031 | 0.6682 | 0.6691 |
| 1 | australia | 2020-02-01 | 50.2 | 49.0 | 95.5 | 0.75 | 2.2 | 1.7 | 5.1 | 1.2 | audusd | 0.6690 | 0.6776 | 0.6434 | 0.6509 |
| 2 | australia | 2020-03-01 | 49.7 | 38.5 | 91.9 | 0.50 | 2.2 | 1.7 | 5.2 | 1.2 | audusd | 0.6488 | 0.6686 | 0.5507 | 0.6135 |
| 3 | australia | 2020-04-01 | 44.1 | 19.5 | 75.6 | 0.25 | -0.3 | 1.2 | 6.3 | -6.1 | audusd | 0.6133 | 0.6571 | 0.5979 | 0.6510 |
| 4 | australia | 2020-05-01 | 44.0 | 26.9 | 88.1 | 0.25 | -0.3 | 1.2 | 7.0 | -6.1 | audusd | 0.6511 | 0.6684 | 0.6371 | 0.6666 |
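To see the cleaning step in isolation, here is a minimal sketch on a made-up two-row frame. The sample values and the exact text format (a number followed by `%`) are assumptions for illustration; only the column names come from the dataset above.

```python
import pandas as pd

# Hypothetical miniature of the raw file: rates stored as text with a '%' suffix
raw = pd.DataFrame({
    "country": ["australia", "australia"],
    "date": ["2020-01-01", "2020-02-01"],
    "unemployment rate": ["5.2%", "5.1%"],
})

clean = raw.copy()
# parse the date strings into real timestamps so matplotlib can use them as an x axis
clean["date"] = pd.to_datetime(clean["date"])
# strip the literal '%' character, then convert the remaining text to float
clean["unemployment rate"] = (
    clean["unemployment rate"].str.replace("%", "", regex=False).astype(float)
)

print(clean.dtypes)
```

Passing `regex=False` makes it explicit that `%` is treated as a plain character rather than a pattern.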
Basic small multiples
Small multiples are a type of chart that shows the same type of information for different categories. In this case, we will create a small multiple of line charts, one for each category (country) in the dataset.
The grid should have exactly as many subplots as there are distinct values in the category column (country in this case). Because the loop pairs categories with axes using zip(), extra categories are silently dropped when there are too few subplots, and extra subplots stay empty when there are too many. In our case we have 9 countries/regions, so we create 9 subplots (3 rows and 3 columns).
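When the number of categories does not fill the grid exactly, one possible approach is to compute the number of rows from the category count and hide the leftover panels. The sketch below is not part of the original chart: the `groups` list is a hypothetical stand-in with 7 categories, and the `Agg` backend is only selected so the snippet runs without a display.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

groups = ["a", "b", "c", "d", "e", "f", "g"]  # hypothetical: 7 categories
ncols = 3
nrows = math.ceil(len(groups) / ncols)  # enough rows to hold every category

# squeeze=False guarantees a 2D array of axes, even for a single row or column
fig, axs = plt.subplots(nrows=nrows, ncols=ncols, figsize=(12, 8), squeeze=False)
panels = axs.ravel()

# one panel per category
for group, ax in zip(groups, panels):
    ax.set_title(group)

# hide the panels that did not receive a category
for ax in panels[len(groups):]:
    ax.set_visible(False)
```

With 9 categories and 3 columns, as in this tutorial, the second loop has nothing to hide and the grid is filled exactly.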
# parameters
dpi = 150
category = 'country'
year = 'date'
value = 'unemployment rate'
countries = df.groupby(category)[value].max().sort_values(ascending=False).index.tolist()
fig, axs = plt.subplots(nrows=3, ncols=3, figsize=(12, 8), dpi=dpi)
for i, (group, ax) in enumerate(zip(countries, axs.flat)):
    # filter main and other groups
    filtered_df = df[df[category] == group]
    other_groups = df[category].unique()[df[category].unique() != group]
    # Plot other groups with lighter colors
    for other_group in other_groups:
        other_y = df[value][df[category] == other_group]
        other_x = df[year][df[category] == other_group]
        ax.plot(other_x, other_y, color='grey', alpha=0.2)
    # Plot the main group
    x = filtered_df[year]
    y = filtered_df[value]
    ax.plot(x, y, color='black')
plt.show()
Customize axis
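A simple way to make a chart cleaner is to remove the axes. This can be done using the set_axis_off() method.
We also set the y-axis limits explicitly, slightly wider than the data range, using the set_ylim() method. Since the limits are computed from the whole dataset, every subplot shares the same scale and the lines stay comparable across panels.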
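The idea of shared, slightly padded limits can be sketched on a toy example. The two series below are made-up values, and the padding amounts (0.2 below, 0.3 above) simply mirror the ones used in this tutorial; the `Agg` backend is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Hypothetical data: two short series with different ranges
series = {"a": [5.2, 5.1, 7.0], "b": [3.5, 14.7, 6.9]}

# compute the limits once, from all values, so every panel uses the same scale
all_values = [v for values in series.values() for v in values]
y_min = min(all_values) - 0.2
y_max = max(all_values) + 0.3

fig, axs = plt.subplots(ncols=2, squeeze=False)
for (name, values), ax in zip(series.items(), axs.ravel()):
    ax.plot(values, color="black")
    ax.set_axis_off()           # hide ticks, labels and spines
    ax.set_ylim(y_min, y_max)   # same vertical range in every panel
```

Without the explicit limits, each panel would autoscale to its own data and the two lines would look similar in height despite their very different ranges.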
# parameters
dpi = 150
category = 'country'
year = 'date'
value = 'unemployment rate'
countries = df.groupby(category)[value].max().sort_values(ascending=False).index.tolist()
fig, axs = plt.subplots(nrows=3, ncols=3, figsize=(12, 8), dpi=dpi)
for i, (group, ax) in enumerate(zip(countries, axs.flat)):
    # filter main and other groups
    filtered_df = df[df[category] == group]
    other_groups = df[category].unique()[df[category].unique() != group]
    # Plot other groups with lighter colors
    for other_group in other_groups:
        other_y = df[value][df[category] == other_group]
        other_x = df[year][df[category] == other_group]
        ax.plot(other_x, other_y, color='grey', alpha=0.2)
    # Plot the main group
    x = filtered_df[year]
    y = filtered_df[value]
    ax.plot(x, y, color='black')
    # Custom axes
    ax.set_axis_off()
    ax.set_ylim(df[value].min()-0.2, df[value].max()+0.3)
plt.show()
Custom colors
To give each group its own color, we define a list of colors with the same length as the number of groups. In our case, we manually define 9 different colors. Each color is then accessed with colors[i] inside the for loop.
We also change the background color of the plot to make it more appealing.
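Index-based lookup ties each color to the position of its group in the sorted list. As an alternative sketch, the pairing can be made explicit with a dictionary, plus a check that there are enough colors. The group names below are hypothetical placeholders (only `australia` appears in the data preview above), and the three colors are taken from the palette used in this section.

```python
# Hypothetical group order; in the tutorial this would be the `countries` list
groups = ["australia", "region_b", "region_c"]
palette = ["#bc6c25", "#00b4d8", "#d62828"]

# fail early instead of hitting an IndexError in the middle of the plotting loop
if len(palette) < len(groups):
    raise ValueError(
        f"need {len(groups)} colors, but only {len(palette)} were provided"
    )

# explicit group -> color mapping, in the same order as the panels
color_by_group = dict(zip(groups, palette))

for group in groups:
    print(group, color_by_group[group])
```

Inside the plotting loop, `color_by_group[group]` would then play the role of `colors[i]`.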
# parameters
dpi = 150
category = 'country'
year = 'date'
value = 'unemployment rate'
background_color = '#001219'
linewidth_main = 1.2
colors = [
'#bc6c25','#00b4d8','#d62828',
'#2a9d8f','#e29578','#9d4edd',
'#a3b18a','#ffe6a7','#78a1bb'
]
# custom order for the countries
countries = df.groupby(category)[value].max().sort_values(ascending=False).index.tolist()
fig, axs = plt.subplots(nrows=3, ncols=3, figsize=(12, 8), dpi=dpi)
fig.set_facecolor(background_color)
for i, (group, ax) in enumerate(zip(countries, axs.flat)):
    # Set the background color
    ax.set_facecolor(background_color)
    # filter main and other groups
    filtered_df = df[df[category] == group]
    other_groups = df[category].unique()[df[category].unique() != group]
    # Plot other groups with lighter colors
    for other_group in other_groups:
        other_y = df[value][df[category] == other_group]
        other_x = df[year][df[category] == other_group]
        ax.plot(other_x, other_y, color='grey', alpha=0.2, linewidth=linewidth_main)
    # Plot the main group
    x = filtered_df[year]
    y = filtered_df[value]
    ax.plot(x, y, color=colors[i], linewidth=linewidth_main, zorder=10)
    # Custom axes
    ax.set_axis_off()
    ax.set_ylim(df[value].min()-0.2, df[value].max()+0.3)
plt.show()