Scatterplot with point colors representing a continuous variable in seaborn FacetGrid

I am trying to generate a multi-panel figure using seaborn in Python, and I want the color of the points in each panel to be determined by a continuous variable. Here's an example of what I am trying to do with the "iris" dataset:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib as mpl
import matplotlib.pyplot as plt
iris = sns.load_dataset('iris')

g = sns.FacetGrid(iris, col = 'species', hue = 'petal_length', palette = 'seismic')
g = g.map(plt.scatter, 'sepal_length', 'sepal_width', s = 100, alpha = 0.5)
g.add_legend()

This makes the following figure: [image: iris_continuous]

Which is nice, but the legend is way too long. I'd like to sample out like 1/4 of these values (ideally) or barring that display a colorbar instead. For instance, something like this might be acceptable, but I'd still want to split it over the three species.

plt.scatter(iris.sepal_length, iris.sepal_width, alpha = .8, c = iris.petal_length, cmap = 'seismic')
cbar = plt.colorbar()

[image: one panel]

Any idea about how I can get the best of both of these plots?

Edit: This topic seems like a good start.

https://github.com/mwaskom/seaborn/issues/582

For that user, simply calling plt.colorbar after everything else had run seemed to work. It doesn't seem to help in this case, though.

ohnoplus asked Jun 20 '17 00:06




1 Answer

The FacetGrid hue is categorical, not continuous. It will require a little bit of work to get a continuous colormap for a scatterplot in the FacetGrid (unlike with imshow in the linked Github issue, matplotlib does not keep a reference to the "currently active scatterplot mapper" so that a magic call to plt.colorbar doesn't pick up the mapping applied to the point colors).

g = sns.FacetGrid(iris, col='species', palette = 'seismic')

def facet_scatter(x, y, c, **kwargs):
    """Draw a scatterplot with point colors taken from a faceted DataFrame column."""
    # FacetGrid.map passes a single `color`; drop it so it does not clash with `c`
    kwargs.pop("color", None)
    plt.scatter(x, y, c=c, **kwargs)

vmin, vmax = 0, 7
cmap = sns.diverging_palette(240, 10, l=65, center="dark", as_cmap=True)

g = g.map(facet_scatter, 'sepal_length', 'sepal_width', "petal_length",
          s=100, alpha=0.5, vmin=vmin, vmax=vmax, cmap=cmap)

# Make space for the colorbar
g.fig.subplots_adjust(right=.92)

# Define a new Axes where the colorbar will go
cax = g.fig.add_axes([.94, .25, .02, .6])

# Get a mappable object with the same colormap as the data
points = plt.scatter([], [], c=[], vmin=vmin, vmax=vmax, cmap=cmap)

# Draw the colorbar
g.fig.colorbar(points, cax=cax)


mwaskom answered Oct 14 '22 23:10
