A marginal plot lets you study the relationship between two numeric variables. The central chart displays their correlation. It is usually a scatterplot, a hexbin plot, a 2D histogram or a 2D density plot. The marginal charts, usually at the top and at the right, show the distribution of each variable using a histogram or a density plot.
The seaborn library provides a jointplot function that is really handy for making this type of graphic. The top graph shows its default behaviour, and the others show a few possible customizations. Seaborn has good documentation, and some of these examples come from there.
- #82 Default Marginal plot
- #82 Hexbin with marginal plot
- #82 2D contour with marginal plots
Three main options are available for the central part: a scatterplot (with possible variations), a hexbin plot or a 2D density plot. Here is how to make them:

    # library & dataset
    import seaborn as sns
    df = sns.load_dataset('iris')

    # Customize the inside plot: options are "scatter" | "reg" | "resid" | "kde" | "hex"
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='scatter')
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='hex')
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='kde')

- #82 Custom marginal plot
- #82 Custom color of marginal plot
Note that you can customize these central graphics:

    # Then you can pass arguments to each type:
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='scatter', s=200, color='m', edgecolor="skyblue", linewidth=2)

    # Customize the color
    sns.set(style="white", color_codes=True)
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='kde', color="skyblue")

You can easily customize the marginal plots using the marginal_kws argument. Do not hesitate to visit the histogram and density sections of the gallery to see which customizations are available.
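
    # library & dataset
    import seaborn as sns
    df = sns.load_dataset('iris')

    # Customize the marginal histograms (30 bins)
    g = sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='hex', marginal_kws=dict(bins=30))

    # Add a rug: recent seaborn releases (0.11 and later) draw the marginals with histplot,
    # which has no rug option, so the rug is drawn in a separate step
    g.plot_marginals(sns.rugplot, color="black", height=0.1)
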
- #82 Custom space in marginal plot
- #82 Custom ratio in marginal plot
You can control the space between main and marginal plots, and the size ratio between these 2 parts:

    # library & dataset
    import seaborn as sns
    df = sns.load_dataset('iris')

    # No space
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='kde', color="grey", space=0)

    # Huge space
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='kde', color="grey", space=3)

    # Make marginal bigger:
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='kde', ratio=1)
