I'm trying to produce a bar chart with all the observations in my DataFrame, which looks like this: DataFrame (rows = years, columns = objects, values = violations of object in year)
I'm getting the right type of graph when using the default pandas plot:
cluster_yearly_results_df.plot.bar()
[Figure: correct bar plot]
However, I would like to use seaborn, and I am having trouble inputting wide-form dataframes, using:
sns.barplot(data=cluster_yearly_results_df)
Can I use seaborn for what I want to do?
The seaborn.barplot docs say:
A bar plot represents an estimate of central tendency for a numeric variable with the height of each rectangle and provides some indication of the uncertainty around that estimate using error bars.
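In other words, the purpose is to summarize multiple values of one variable with a single bar whose height is an estimate (the mean by default), with error bars indicating the uncertainty around that estimate (by default a bootstrapped 95% confidence interval rather than the standard deviation). You are instead looking to represent individual values with bars, as DataFrame.plot.bar() does.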
Having said this, you can tweak your DataFrame as below to match the seaborn interface. Starting with a DataFrame similar to yours:
df = pd.DataFrame(np.random.randint(low=0, high=10, size=(10, 3)), columns=list('ABC'))
A B C
0 7 6 4
1 3 5 9
2 3 0 5
3 0 1 3
4 9 7 7
Use .stack() and .reset_index() to create two columns that uniquely identify each value in y:
df = df.stack().reset_index()
df.columns = ['x', 'hue', 'y']
which yields:
x hue y
0 0 A 7
1 0 B 6
2 0 C 4
3 1 A 3
4 1 B 5
then plot:
sns.barplot(y='y', x='x', hue='hue', data=df)
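Putting the steps together, a self-contained sketch might look like the following. The random data and the names wide and long_df are illustrative assumptions, not the asker's actual DataFrame, and the Agg backend is used only so the script runs headless.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to display the figure
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical wide-form data: one row per observation index, one column per object
rng = np.random.default_rng(0)
wide = pd.DataFrame(rng.integers(0, 10, size=(10, 3)), columns=list("ABC"))

# Reshape to long form: one row per (row index, column name) pair
long_df = wide.stack().reset_index()
long_df.columns = ["x", "hue", "y"]

# Each (x, hue) pair now holds exactly one value, so every bar shows
# that single value and there is nothing to aggregate.
ax = sns.barplot(data=long_df, x="x", y="y", hue="hue")
ax.set_xlabel("row index")
ax.set_ylabel("value")
```

Because each x/hue combination appears only once, the bar heights are the raw values and the grouped layout resembles the output of DataFrame.plot.bar().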