I spent a few hours searching for an answer, but I can't seem to get one.
Long story short, I have a dataframe. The following code will produce the dataframe in question (albeit anonymised with random numbers):
import pandas as pd
variable1 = ["Attribute 1","Attribute 1","Attribute 1","Attribute 1","Attribute 1","Attribute 1","Attribute 2","Attribute 2",
"Attribute 2","Attribute 2","Attribute 2","Attribute 2","Attribute 3","Attribute 3","Attribute 3","Attribute 3",
"Attribute 3","Attribute 3","Attribute 4","Attribute 4","Attribute 4","Attribute 4","Attribute 4","Attribute 4",
"Attribute 5","Attribute 5","Attribute 5","Attribute 5","Attribute 5","Attribute 5"]
variable2 = ["Property1","Property2","Property3","Property4","Property5","Property6","Property1","Property2","Property3",
"Property4","Property5","Property6","Property1","Property2","Property3",
"Property4","Property5","Property6","Property1","Property2","Property3","Property4",
"Property5","Property6","Property1","Property2","Property3","Property4","Property5","Property6"]
number = [93,224,192,253,186,266,296,100,135,169,373,108,211,194,164,375,211,71,120,334,59,164,348,50,249,18,251,343,172,41]
bar = pd.DataFrame({"variable1":variable1, "variable2":variable2, "number":number})
bar_grouped = bar.groupby(["variable1","variable2"]).sum()
The outcome should look like this: [image: the `bar` dataframe], and the second one: [image: the `bar_grouped` dataframe].
I have been trying to plot them as a bar chart, with the Properties as the groups and the different Attributes as the bars, similar to this (though that one was plotted manually in Excel). I would prefer to plot from the grouped dataframe, so that I can plot with different groupings without having to reset the index each time.
I hope this is clear.
Any help on this is hugely appreciated.
Thanks! :)
I wouldn't bother creating your groupby result (since you aren't aggregating anything). This is a pivot:
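import matplotlib.pyplot as plt
# Keyword arguments are required here in pandas 2.0 and later.
bar.pivot(index='variable2', columns='variable1', values='number').plot(kind='bar')
plt.tight_layout()
plt.show()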
If aggregation is required, you can still start with your bar and use pivot_table:
bar.pivot_table(index='variable2', columns='variable1', values='number', aggfunc='sum')
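To see where aggregation actually matters, here is a minimal sketch on a small made-up frame (`df` and its numbers are illustrative, not the data from the question) in which one Property/Attribute pair occurs twice. On data like this, `pivot` raises a `ValueError` because of the duplicate entry, whereas `pivot_table` sums the duplicates:

```python
import pandas as pd

# Hypothetical toy data: ("Property1", "Attribute 1") appears twice.
df = pd.DataFrame({
    "variable1": ["Attribute 1", "Attribute 1", "Attribute 2",
                  "Attribute 1", "Attribute 2"],
    "variable2": ["Property1", "Property1", "Property1",
                  "Property2", "Property2"],
    "number": [10, 5, 7, 3, 4],
})

# Rows become the x-axis groups, columns become the bars within each group.
table = df.pivot_table(index="variable2", columns="variable1",
                       values="number", aggfunc="sum")
print(table)
```

Calling `table.plot(kind='bar')` on the result should then draw one group per Property, exactly as with the `pivot` version.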
Use unstack first:
bar_grouped['number'].unstack(0).plot(kind='bar')
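Since the question asks about switching groupings without resetting the index, here is a rough sketch of how the level passed to `unstack` decides what ends up on the x-axis. It uses a cut-down, hypothetical version of `bar` (four rows only), and refers to the index levels by name instead of by position, which is an equivalent alternative to `unstack(0)`:

```python
import pandas as pd

# Hypothetical toy data standing in for the question's `bar` frame.
bar = pd.DataFrame({
    "variable1": ["Attribute 1", "Attribute 1", "Attribute 2", "Attribute 2"],
    "variable2": ["Property1", "Property2", "Property1", "Property2"],
    "number": [93, 224, 296, 100],
})
bar_grouped = bar.groupby(["variable1", "variable2"]).sum()

# Moving "variable1" into the columns leaves the Properties as the row index,
# so a bar plot of this table groups the bars by Property.
by_property = bar_grouped["number"].unstack("variable1")

# Moving "variable2" instead leaves the Attributes as the row index,
# so a bar plot of this table groups the bars by Attribute.
by_attribute = bar_grouped["number"].unstack("variable2")

print(by_property)
print(by_attribute)
```

Either table can be passed straight to `.plot(kind='bar')`; the row index supplies the groups and each column supplies one bar per group.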