Seaborn catplot combined with PairGrid

I am playing with the Titanic dataset and trying to produce a pair plot of numeric variables against categorical variables. I can use Seaborn's catplot to plot one numeric variable against one categorical variable:

import seaborn as sns

sns.catplot(data=train, x='Fare', y='Sex')

However, if I try to use PairGrid to graph numeric variables against categorical variables:

x_vars = ['Fare']
y_vars = ['Sex']

g = sns.PairGrid(train, x_vars=x_vars, y_vars=y_vars)
g.map(sns.catplot)

It fails with an error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-75-c284a7cfd727> in <module>
      9 #g.map_diag(lambda x, **kwargs: sns.catplot(x, x, **kwargs), jitter=True, kind="bar")
     10 #g.map(sns.scatterplot, y_jitter=1)#, hue=train["Survived"])
---> 11 g.map(sns.catplot)#, hue=train["Survived"])

~/MLProject/book1/lib/python3.8/site-packages/seaborn/axisgrid.py in map(self, func, **kwargs)
   1363         row_indices, col_indices = np.indices(self.axes.shape)
   1364         indices = zip(row_indices.flat, col_indices.flat)
-> 1365         self._map_bivariate(func, indices, **kwargs)
   1366         return self
   1367 

~/MLProject/book1/lib/python3.8/site-packages/seaborn/axisgrid.py in _map_bivariate(self, func, indices, **kwargs)
   1504             y_var = self.y_vars[i]
   1505             ax = self.axes[i, j]
-> 1506             self._plot_bivariate(x_var, y_var, ax, func, kw_color, **kws)
   1507         self._add_axis_labels()
   1508 

~/MLProject/book1/lib/python3.8/site-packages/seaborn/axisgrid.py in _plot_bivariate(self, x_var, y_var, ax, func, kw_color, **kwargs)
   1534             color = self.palette[k] if kw_color is None else kw_color
   1535 
-> 1536             func(x, y, label=label_k, color=color, **kwargs)
   1537 
   1538         self._clean_axis(ax)

~/MLProject/book1/lib/python3.8/site-packages/seaborn/categorical.py in catplot(x, y, hue, data, row, col, col_wrap, estimator, ci, n_boot, units, seed, order, hue_order, row_order, col_order, kind, height, aspect, orient, color, palette, legend, legend_out, sharex, sharey, margin_titles, facet_kws, **kwargs)
   3760 
   3761     # Initialize the facets
-> 3762     g = FacetGrid(**facet_kws)
   3763 
   3764     # Draw the plot onto the facets

~/MLProject/book1/lib/python3.8/site-packages/seaborn/axisgrid.py in __init__(self, data, row, col, hue, col_wrap, sharex, sharey, height, aspect, palette, row_order, col_order, hue_order, hue_kws, dropna, legend_out, despine, margin_titles, xlim, ylim, subplot_kws, gridspec_kws, size)
    268         # Make a boolean mask that is True anywhere there is an NA
    269         # value in one of the faceting variables, but only if dropna is True
--> 270         none_na = np.zeros(len(data), np.bool)
    271         if dropna:
    272             row_na = none_na if row is None else data[row].isnull()

TypeError: object of type 'NoneType' has no len()

If I replace g.map(sns.catplot) with g.map(sns.scatterplot), the plot is drawn without error.

How can I combine catplot with PairGrid?

asked Jan 26 '20 20:01 by wingedsubmariner


1 Answer

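@ImportanceOfBeingErnest already gave the right answer in a comment above: it does not make sense to combine sns.catplot() with a separately created grid such as a PairGrid, because sns.catplot() is a figure-level function that creates its own FacetGrid when called. That is also why the traceback ends inside FacetGrid: PairGrid calls the mapped function with x and y as positional arguments and never passes data, so the FacetGrid is built with data=None and len(data) fails.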

In any case, sns.catplot() invokes another seaborn function to do the actual plotting in each cell of the grid. That function can be selected by specifying the kind keyword argument to sns.catplot(). The default is kind="strip".
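The difference between the two kinds of function can be checked directly. This is a minimal sketch, assuming a small made-up DataFrame (the name df and its values are invented for illustration, not taken from the Titanic data): sns.catplot() returns a new FacetGrid, while sns.violinplot() draws into an Axes that already exists.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up stand-in for the Titanic data (values are illustrative only)
df = pd.DataFrame({
    "sex": ["male", "female", "female", "male", "female", "male"],
    "fare": [7.25, 71.28, 7.92, 8.05, 53.10, 8.46],
})

# Figure-level: builds its own figure and returns the FacetGrid that manages it
grid = sns.catplot(data=df, x="sex", y="fare", kind="violin")
print(type(grid).__name__)

# Axes-level: draws into the Axes it is given and returns that same Axes
fig, ax = plt.subplots()
returned_ax = sns.violinplot(data=df, x="sex", y="fare", ax=ax)
print(returned_ax is ax)
```

Only functions of the second type can draw into the cells of a grid that was created beforehand.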

So if you want to create a PairGrid manually and map the default behavior of sns.catplot() onto it (that is, without specifying the kind), you may as well use sns.stripplot() directly. This does work, but the Titanic dataset is too large for strip plots to be very informative, so I would use violin plots instead:

import seaborn as sns
sns.set()

titanic = sns.load_dataset('titanic')

num_vars = ['age', 'fare']
cat_vars = ['pclass', 'embarked', 'sex']

g = sns.PairGrid(data=titanic, x_vars=cat_vars, y_vars=num_vars)
g.map(sns.violinplot)

[Image: facet violin plots]

See the sns.catplot() documentation for more details.
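Since each kind corresponds to one axes-level function, the lookup can also be written out explicitly. The sketch below is illustrative only: KIND_TO_FUNC and axes_level_func are hypothetical names, not part of seaborn, and the data is randomly generated rather than the real Titanic set. The table follows the kinds listed in the catplot documentation.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical lookup table: catplot "kind" -> axes-level seaborn function
KIND_TO_FUNC = {
    "strip": sns.stripplot,
    "swarm": sns.swarmplot,
    "box": sns.boxplot,
    "violin": sns.violinplot,
    "boxen": sns.boxenplot,
    "point": sns.pointplot,
    "bar": sns.barplot,
    "count": sns.countplot,
}

def axes_level_func(kind="strip"):
    """Return the axes-level function for a catplot kind (default: strip)."""
    return KIND_TO_FUNC[kind]

# Randomly generated stand-in data with two categorical and two numeric columns
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "sex": np.tile(["male", "female"], 30),
    "embarked": np.tile(["S", "C", "Q"], 20),
    "age": rng.normal(30, 12, size=60),
    "fare": rng.exponential(30, size=60),
})

# Map the looked-up function onto a manually created PairGrid
g = sns.PairGrid(data=df, x_vars=["sex", "embarked"], y_vars=["age", "fare"])
g.map(axes_level_func("violin"))
```

Note that not every entry suits a two-variable grid: sns.countplot(), for example, takes a single variable, so it is unlikely to work when PairGrid supplies both x and y.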


Follow-up question by @Bjarne Thorsted: How to replace the boxplots inside the violinplots with swarmplots?

You can use the same approach, this time calling g.map() twice: once for the violins and once for the swarms. To change the default parameters of the plotting functions passed to g.map(), define wrapper functions that forward their arguments with the * and ** operators:

import seaborn as sns
sns.set()

titanic = sns.load_dataset('titanic')

num_vars = ['age', 'fare']
cat_vars = ['pclass', 'embarked', 'sex']

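def violin_empty(*args, **kwargs):
    # Override the color passed in by PairGrid and drop the inner boxplot
    kwargs['color'] = 'lightblue'
    return sns.violinplot(*args, **kwargs, inner=None)

def swarm_small(*args, **kwargs):
    # Small black points, so the swarm fits inside the violin
    kwargs['color'] = 'black'
    return sns.swarmplot(*args, **kwargs, size=1)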

g = sns.PairGrid(data=titanic, x_vars=cat_vars, y_vars=num_vars)
g.map(violin_empty)
g.map(swarm_small)

[Image: violinplots with swarmplots inside]
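As a possible alternative to wrapper functions, keyword arguments given to g.map() are forwarded to the plotting function, so the same settings can be passed inline. This is a minimal sketch under that assumption, using randomly generated stand-in data rather than the Titanic set; how color interacts with the grid's own defaults may differ between seaborn versions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import pandas as pd
import seaborn as sns

# Randomly generated stand-in data (not the real Titanic values)
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "sex": np.tile(["male", "female"], 30),
    "embarked": np.tile(["S", "C", "Q"], 20),
    "age": rng.normal(30, 12, size=60),
    "fare": rng.exponential(30, size=60),
})

num_vars = ["age", "fare"]
cat_vars = ["sex", "embarked"]

g = sns.PairGrid(data=df, x_vars=cat_vars, y_vars=num_vars)
# Extra keyword arguments are handed on to the mapped function in every cell
g.map(sns.violinplot, inner=None, color="lightblue")
g.map(sns.swarmplot, size=1, color="black")
```

The wrapper functions remain the more flexible option when a setting has to be computed or when the same customized plot is reused across several grids.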

answered Oct 25 '22 21:10 by Arne