When working with data visualizations in Python, we may want to split the data up by categories or different groups. This is achieved using Seaborn's FacetGrid function. This allows us to create a grid, on which we can map many other types of plots with minimal code, especially compared to using matplotlib.
In Seaborn, we will plot multiple graphs in a single window in two ways. First with the help of Facetgrid() function and other by implicit with the help of matplotlib. data: Tidy dataframe where each column is a variable and each row is an observation.
FacetGrid object takes a dataframe as input and the names of the variables that will form the row, column, or hue dimensions of the grid. The variables should be categorical and the data at each level of the variable will be used for a facet along that axis.
I want to plot error bars from a column in a pandas dataframe on a Seaborn FacetGrid
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar']*2,
'B' : ['one', 'one', 'two', 'three',
'two', 'two', 'one', 'three'],
'C' : np.random.randn(8),
'D' : np.random.randn(8)})
df
Example dataframe
A B C D
0 foo one 0.445827 -0.311863
1 bar one 0.862154 -0.229065
2 foo two 0.290981 -0.835301
3 bar three 0.995732 0.356807
4 foo two 0.029311 0.631812
5 bar two 0.023164 -0.468248
6 foo one -1.568248 2.508461
7 bar three -0.407807 0.319404
This code works for fixed size error bars:
g = sns.FacetGrid(df, col="A", hue="B", size =5)
g.map(plt.errorbar, "C", "D",yerr=0.5, fmt='o');
But I can't get it to work using values from the dataframe
df['E'] = abs(df['D']*0.5)
g = sns.FacetGrid(df, col="A", hue="B", size =5)
g.map(plt.errorbar, "C", "D", yerr=df['E']);
or
g = sns.FacetGrid(df, col="A", hue="B", size =5)
g.map(plt.errorbar, "C", "D", yerr='E');
both produce screeds of errors
EDIT:
After lots of matplotlib doc reading, and assorted stackoverflow answers, here is a pure matplotlib solution
#define a color palette index based on column 'B'
df['cind'] = pd.Categorical(df['B']).labels
#how many categories in column 'A'
cats = df['A'].unique()
cats.sort()
#get the seaborn colour palette and convert to array
cp = sns.color_palette()
cpa = np.array(cp)
#draw a subplot for each category in column "A"
fig, axs = plt.subplots(nrows=1, ncols=len(cats), sharey=True)
for i,ax in enumerate(axs):
df_sub = df[df['A'] == cats[i]]
col = cpa[df_sub['cind']]
ax.scatter(df_sub['C'], df_sub['D'], c=col)
eb = ax.errorbar(df_sub['C'], df_sub['D'], yerr=df_sub['E'], fmt=None)
a, (b, c), (d,) = eb.lines
d.set_color(col)
Other than the labels, and axis limits its OK. Its plotted a separate subplot for each category in column 'A', colored by the category in column 'B'. (Note the random data is different to that above)
I'd still like a pandas/seaborn solution if anyone has any ideas?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With