Wordcloud


A word cloud (also called a tag cloud or weighted list) is a visual representation of text data. Tags are usually single words, and the importance of each is shown with font size or color. Python has a wordcloud library that makes them easy to build.

⏱ Quick start

# Libraries
from wordcloud import WordCloud
import matplotlib.pyplot as plt

# Create the text to display (a single string of words separated by spaces)
text = "Python Python Python Matplotlib"

# Create the wordcloud object
wordcloud = WordCloud(width=480, height=480, margin=0).generate(text)

# Display the generated image:
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.margins(x=0, y=0)
plt.show()

⚠️ The issue with wordclouds

Wordclouds are aesthetically pleasing and familiar to most people, so readers tend to understand them quickly.

However, it is important to consider their caveats. For instance, longer words take up more space on the figure by construction, which distorts their apparent importance. Moreover, it is impossible to translate a font size into an accurate value.

Wordclouds with... the wordcloud library 😀

The wordcloud library provides a WordCloud class. We just have to pass a large string of text to its generate() method, and it will build a wordcloud for us.

Then, we just have to call the imshow() function from matplotlib to display the wordcloud.

Wordclouds and custom shapes

It is a common need to apply a specific shape to the wordcloud. It's an excellent way to make the wordcloud more relevant to the data you are displaying. The wordcloud library allows you to do it by using a mask, and it's quite easy to do!

You can find the official documentation here and some examples of how to use it in practice below.

Contact


👋 This document is a work by Yan Holtz. You can contribute on GitHub, send me feedback on Twitter, or subscribe to the newsletter to be notified when new examples are published! 🔥