#82 Marginal plot with Seaborn

A marginal plot lets you study the relationship between two numeric variables. The central chart displays their correlation: it is usually a scatterplot, a hexbin plot, a 2D histogram or a 2D density plot. The marginal charts, usually at the top and on the right, show the distribution of each variable using a histogram or a density plot.

The seaborn library provides a jointplot function that is really handy for making this type of graphic. The top graph shows its default behaviour, and here are a few possible customizations. Seaborn has good documentation, and some of these examples come from there.

  • Three main options are available for the central part: a scatterplot (with possible variations), a hexbin plot or a 2D density plot. Here is how to make them:

    
    # library & dataset
    import seaborn as sns
    df = sns.load_dataset('iris')
    
    # Customize the central plot: kind can be "scatter", "reg", "resid", "kde" or "hex" ("hist" is also accepted in seaborn 0.11+)
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='scatter')
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='hex')
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='kde')
    
    

    Note that you can customize these central graphics:

    
    # Extra keyword arguments are passed on to the function that draws the central plot:
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='scatter', s=200, color='m', edgecolor="skyblue", linewidth=2)
    
    # Customize the color
    sns.set(style="white", color_codes=True)
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='kde', color="skyblue")
    

     

  • You can easily customize the marginal plots using the marginal_kws argument. Do not hesitate to visit the histogram and density sections of the gallery to see which customizations are available.

    
    # library & dataset
    import seaborn as sns
    df = sns.load_dataset('iris')
    
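    # Customize the marginal histograms (since seaborn 0.11 they are drawn with histplot, which has no rug option)
    g = sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='hex', marginal_kws=dict(bins=30))
    
    # Add the rug as a separate layer on the marginal axes
    g.plot_marginals(sns.rugplot, color="grey", height=0.1)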
    
    
  • You can control the space between the main and marginal plots, and the size ratio between these two parts:

    
    # library & dataset
    import seaborn as sns
    df = sns.load_dataset('iris')
    
    # No space
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='kde', color="grey", space=0)
    
    # Huge space
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='kde', color="grey", space=3)
    
    # Make the marginal plots bigger (ratio=1 gives them the same size as the central plot):
    sns.jointplot(x=df["sepal_length"], y=df["sepal_width"], kind='kde', ratio=1)
    
    

 
