
How to add error bars on a grouped barplot from a pandas column

I have a pandas data frame df that has four columns: Candidate, Sample_Set, Values, and Error. The Candidate column has, say, three unique entries: [X, Y, Z] and we have three sample sets, such that Sample_Set has three unique values as well: [1,2,3]. The df would roughly look like this.

import pandas as pd

data = {'Candidate': ['X', 'Y', 'Z', 'X', 'Y', 'Z', 'X', 'Y', 'Z'],
        'Sample_Set': [1, 1, 1, 2, 2, 2, 3, 3, 3],
        'Values': [20, 10, 10, 200, 101, 99, 1999, 998, 1003],
        'Error': [5, 2, 3, 30, 30, 30, 10, 10, 10]}
df = pd.DataFrame(data)

# display(df)
  Candidate  Sample_Set  Values  Error
0         X           1      20      5
1         Y           1      10      2
2         Z           1      10      3
3         X           2     200     30
4         Y           2     101     30
5         Z           2      99     30
6         X           3    1999     10
7         Y           3     998     10
8         Z           3    1003     10

I am using seaborn to create a grouped barplot out of this with x="Candidate", y="Values", hue="Sample_Set". All's good, until I try to add an error bar along the y-axis using the values under the column named Error. I am using the following code.

import seaborn as sns

# factorplot was renamed catplot (and size became height) in seaborn 0.9
g = sns.catplot(x="Candidate", y="Values", hue="Sample_Set", data=df,
                height=8, kind="bar")

How do I incorporate the error?

I would appreciate a solution, or a more elegant approach to the task.

asked Nov 04 '25 20:11 by EFL


1 Answer

As @ResMar pointed out in the comments, there seems to be no built-in functionality in seaborn to easily set individual errorbars.

If you care more about the result than about how you get there, the following (not so elegant) solution might help. It builds on matplotlib.pyplot.bar; seaborn is imported only to get the same style.

import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd

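def grouped_barplot(df, cat, subcat, val, err):
    # One group of bars per unique value of `cat`,
    # one bar inside each group per unique value of `subcat`.
    u = df[cat].unique()
    x = np.arange(len(u))
    subx = df[subcat].unique()
    # Spread the bars of a group symmetrically around the integer tick position.
    offsets = (np.arange(len(subx)) - np.arange(len(subx)).mean()) / (len(subx) + 1.)
    width = np.diff(offsets).mean()
    for i, gr in enumerate(subx):
        # Rows of this subcategory; assumes they come in the same order as `u`.
        dfg = df[df[subcat] == gr]
        plt.bar(x + offsets[i], dfg[val].values, width=width,
                label="{} {}".format(subcat, gr), yerr=dfg[err].values)
    plt.xlabel(cat)
    plt.ylabel(val)
    plt.xticks(x, u)
    plt.legend()
    plt.show()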


cat = "Candidate"
subcat = "Sample_Set"
val = "Values"
err = "Error"

# call the function with df from the question
grouped_barplot(df, cat, subcat, val, err )
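
To see where the bars end up, it may help to work through the offset arithmetic on its own. The short sketch below repeats the two lines from the function for three subcategories, as in the question; the names `n_sub` and `positions` are only illustrative.

```python
import numpy as np

n_sub = 3                                  # three sample sets, as in the question
positions = np.arange(n_sub)               # 0, 1, 2
# Centre the positions on zero, then shrink them so a group fits between ticks.
offsets = (positions - positions.mean()) / (n_sub + 1.0)
# The bar width is the spacing between neighbouring offsets.
width = np.diff(offsets).mean()

print(offsets)   # values are -0.25, 0.0 and 0.25
print(width)     # 0.25
```

Each group therefore spans 0.75 of the unit distance between ticks, which leaves a gap between neighbouring groups. Note that with a single subcategory `np.diff` returns an empty array and `width` becomes NaN, so that case would need separate handling.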


Note that by simply swapping the category and subcategory

cat = "Sample_Set"
subcat = "Candidate"

you get a different grouping: one cluster of bars per sample set, with one bar per candidate.

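
As for the more elegant approach the question asks about, one option that avoids computing offsets by hand is to pivot the frame and let pandas draw the bars. This is only a sketch and not part of the solution above: the names `values` and `errors` are illustrative, and it relies on `DataFrame.plot` accepting a DataFrame for `yerr` whose columns match those of the plotted frame.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Same data as in the question.
data = {'Candidate': ['X', 'Y', 'Z', 'X', 'Y', 'Z', 'X', 'Y', 'Z'],
        'Sample_Set': [1, 1, 1, 2, 2, 2, 3, 3, 3],
        'Values': [20, 10, 10, 200, 101, 99, 1999, 998, 1003],
        'Error': [5, 2, 3, 30, 30, 30, 10, 10, 10]}
df = pd.DataFrame(data)

# Wide format: one row per candidate, one column per sample set.
values = df.pivot(index="Candidate", columns="Sample_Set", values="Values")
errors = df.pivot(index="Candidate", columns="Sample_Set", values="Error")

# pandas matches the error columns to the value columns by name.
ax = values.plot(kind="bar", yerr=errors, capsize=3, rot=0)
ax.set_ylabel("Values")
ax.legend(title="Sample_Set")
```

Call `plt.show()` afterwards when running this as a script. Swapping `index` and `columns` in both `pivot` calls gives the other grouping. Because the pivot aligns rows and columns by label, this version does not depend on the row order of `df`.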

answered Nov 07 '25 10:11 by ImportanceOfBeingErnest


