Sankey Diagram with Python and Plotly

logo of a chart:Sankey

This blogpost describes how to build a Sankey Diagram with Python and the Plotly library. It starts with basic examples based on various input formats and then explain how to apply the most common customisations.

The Data

What kind of data do we need

source = [0, 0, 1, 1, 0]
target = [2, 3, 4, 5, 4]
value = [8, 2, 2, 8, 4]

Most basic Sankey diagram with Plotly

Once that we have the data, the first step is to load the library

import plotly.graph_objects as go

Now let's use it to plot the Sanke

link = dict(source = source, target = target, value = value)
data = go.Sankey(link = link)

fig = go.Figure(data)

fig.show()

Note that you can save this chart to a standalone file thanks to the write.html() function. It is then possible to render it in any html document using an iframe

# If you need to save this file as a standalong html file:
fig.write_html("../../static/interactiveCharts/sankey-diagram-plotly1.html")
%%html
<iframe src="../../interactiveCharts/sankey-diagram-plotly1.html" width="800" height="600" title="ridgeline chart with plotly" style="border:none"></iframe>

Basic customization

This section needs to be edited. Sorry for the inconvenience 😔

Real life Sankey diagram

This is the example coming from the plotly documentation, in order to see how plotly behaves with a real life dataset!

import plotly.graph_objects as go
import urllib, json

url = 'https://raw.githubusercontent.com/plotly/plotly.js/master/test/image/mocks/sankey_energy.json'
response = urllib.request.urlopen(url)
data = json.loads(response.read())

# override gray link colors with 'source' colors
opacity = 0.4
# change 'magenta' to its 'rgba' value to add opacity
data['data'][0]['node']['color'] = ['rgba(255,0,255, 0.8)' if color == "magenta" else color for color in data['data'][0]['node']['color']]
data['data'][0]['link']['color'] = [data['data'][0]['node']['color'][src].replace("0.8", str(opacity))
                                    for src in data['data'][0]['link']['source']]

fig = go.Figure(data=[go.Sankey(
    valueformat = ".0f",
    valuesuffix = "TWh",
    # Define nodes
    node = dict(
      pad = 15,
      thickness = 15,
      line = dict(color = "black", width = 0.5),
      label =  data['data'][0]['node']['label'],
      color =  data['data'][0]['node']['color']
    ),
    # Add links
    link = dict(
      source =  data['data'][0]['link']['source'],
      target =  data['data'][0]['link']['target'],
      value =  data['data'][0]['link']['value'],
      label =  data['data'][0]['link']['label'],
      color =  data['data'][0]['link']['color']
))])

fig.update_layout(title_text="Energy forecast for 2050<br>Source: Department of Energy & Climate Change, Tom Counsell via <a href='https://bost.ocks.org/mike/sankey/'>Mike Bostock</a>",
                  font_size=10)

Save the figure in html format

# If you need to save this file as a standalong html file:
fig.write_html("../../static/interactiveCharts/sankey-diagram-plotly-real.html")

Render it as an iframe

%%html
<iframe src="../../interactiveCharts/sankey-diagram-plotly-real.html" width="800" height="900" title="ridgeline chart with plotly" style="border:none"></iframe>

Contact & Edit


👋 This document is a work by Yan Holtz. You can contribute on github, send me a feedback on twitter or subscribe to the newsletter to know when new examples are published! 🔥

This page is just a jupyter notebook, you can edit it here. Please help me making this website better 🙏!