How to add error bars on a grouped barplot from a pandas column

Question

I have a pandas DataFrame df with four columns: Candidate, Sample_Set, Values, and Error. The Candidate column has three unique entries, [X, Y, Z], and there are three sample sets, so Sample_Set has three unique values as well: [1, 2, 3]. The df looks roughly like this.

import pandas as pd

data = {'Candidate': ['X', 'Y', 'Z', 'X', 'Y', 'Z', 'X', 'Y', 'Z'],
        'Sample_Set': [1, 1, 1, 2, 2, 2, 3, 3, 3],
        'Values': [20, 10, 10, 200, 101, 99, 1999, 998, 1003],
        'Error': [5, 2, 3, 30, 30, 30, 10, 10, 10]}
df = pd.DataFrame(data)

# display(df)
  Candidate  Sample_Set  Values  Error
0         X           1      20      5
1         Y           1      10      2
2         Z           1      10      3
3         X           2     200     30
4         Y           2     101     30
5         Z           2      99     30
6         X           3    1999     10
7         Y           3     998     10
8         Z           3    1003     10

I am using seaborn to create a grouped barplot from this with x="Candidate", y="Values", hue="Sample_Set". That works fine until I try to add error bars along the y-axis using the values in the Error column. I am using the following code.

import seaborn as sns

g = sns.catplot(x="Candidate", y="Values", hue="Sample_Set", data=df,
                height=8, kind="bar")

How do I incorporate the error?

I would appreciate a solution or a more elegant approach to the task.

ImportanceOfBeingErnest · Accepted Answer

As @ResMar pointed out in the comments, seaborn seems to have no built-in way to easily set individual error bars.

If you care more about the result than about how you get there, the following (not so elegant) solution, which builds on matplotlib.pyplot.bar, might help. The seaborn import is only there to get the same style.

import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd

# Apply seaborn's default style; since seaborn 0.8 the import alone does not.
sns.set()

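def grouped_barplot(df, cat, subcat, val, err):
    u = df[cat].unique()          # category labels: one group of bars per label
    x = np.arange(len(u))         # x position of the centre of each group
    subx = df[subcat].unique()    # subcategories: one bar per subcategory in each group
    # Spread the bars of a group symmetrically around the group centre.
    offsets = (np.arange(len(subx)) - np.arange(len(subx)).mean()) / (len(subx) + 1.)
    width = np.diff(offsets).mean()
    for i, gr in enumerate(subx):
        # Assumes each subcategory has exactly one row per category,
        # in the same order as the labels in u.
        dfg = df[df[subcat] == gr]
        plt.bar(x + offsets[i], dfg[val].values, width=width,
                label="{} {}".format(subcat, gr), yerr=dfg[err].values)
    plt.xlabel(cat)
    plt.ylabel(val)
    plt.xticks(x, u)
    plt.legend()
    plt.show()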


cat = "Candidate"
subcat = "Sample_Set"
val = "Values"
err = "Error"

# call the function with df from the question
grouped_barplot(df, cat, subcat, val, err )
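
To see how the bars are placed within a group, the offset arithmetic from grouped_barplot can be pulled out on its own. This is only a sketch for illustration; the helper name bar_offsets is hypothetical and not part of the answer's code.

```python
import numpy as np


def bar_offsets(n_sub):
    """Return (offsets, width) for n_sub bars per group.

    Hypothetical helper: it repeats the two lines of arithmetic
    used inside grouped_barplot so they can be inspected directly.
    """
    idx = np.arange(n_sub)
    # Centre the indices on zero, then scale so that one bar width
    # of empty space is left between neighbouring groups.
    offsets = (idx - idx.mean()) / (n_sub + 1.0)
    # Neighbouring offsets are equally spaced; that spacing is the bar width.
    width = np.diff(offsets).mean()
    return offsets, width


offsets, width = bar_offsets(3)
print(offsets)  # the three offsets are -0.25, 0.0 and 0.25
print(width)    # 0.25
```

With three subcategories the bars sit at x - 0.25, x and x + 0.25 and are each 0.25 wide, so the group spans 0.75 of every unit step on the x-axis. With a single subcategory np.diff returns an empty array and the width becomes NaN, so that case would need separate handling.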


Note that by simply swapping the category and subcategory

cat = "Sample_Set"
subcat = "Candidate"

you can get a different grouping, with one group per sample set and one bar per candidate.
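
A possibly shorter route, offered here as an alternative sketch rather than as part of the answer above, is to pivot the frame and let pandas draw the bars: DataFrame.plot accepts a yerr DataFrame whose columns match those of the plotted frame. The function name pivot_barplot is hypothetical, and the sketch assumes exactly one row per (category, subcategory) pair, which holds for the df in the question.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Data from the question.
data = {'Candidate': ['X', 'Y', 'Z', 'X', 'Y', 'Z', 'X', 'Y', 'Z'],
        'Sample_Set': [1, 1, 1, 2, 2, 2, 3, 3, 3],
        'Values': [20, 10, 10, 200, 101, 99, 1999, 998, 1003],
        'Error': [5, 2, 3, 30, 30, 30, 10, 10, 10]}
df = pd.DataFrame(data)


def pivot_barplot(df, cat, subcat, val, err):
    """Grouped bar chart with per-bar error bars (hypothetical helper).

    Assumes one row per (cat, subcat) combination; df.pivot raises
    a ValueError if a combination appears more than once.
    """
    # Rows become the groups on the x-axis, columns become the bars in a group.
    values = df.pivot(index=cat, columns=subcat, values=val)
    errors = df.pivot(index=cat, columns=subcat, values=err)
    # pandas matches the columns of yerr to the columns being plotted.
    ax = values.plot(kind="bar", yerr=errors, capsize=3, rot=0)
    ax.set_ylabel(val)
    return ax


ax = pivot_barplot(df, "Candidate", "Sample_Set", "Values", "Error")
```

Call plt.show() afterwards to display the figure. Passing "Sample_Set" as cat and "Candidate" as subcat gives the swapped grouping, and because the pivot aligns values and errors by label, this version does not depend on the row order of df.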
