I have a large DataFrame that I would like to group by a column so that I can examine the distribution of each group graphically with a boxplot. I found that df.boxplot() draws a box for each column of the DataFrame and puts them all in one plot, which is just what I need.
The problem is that after a groupby operation my data is all in one column, with the group labels in the index, so I can't call boxplot on the result.
Here is an example:
from numpy.random import rand
from pandas import DataFrame
df = DataFrame({'a': rand(10), 'b': [x % 2 for x in range(10)]})
df
          a  b
0  0.273548  0
1  0.378765  1
2  0.190848  0
3  0.646606  1
4  0.562591  0
5  0.409250  1
6  0.637074  0
7  0.946864  1
8  0.203656  0
9  0.276929  1
Now I want to group by column b and boxplot the distribution of both groups in one boxplot. How can I do that?
You can use the by argument of boxplot. Is that what you are looking for?
df.boxplot(column='a', by='b')
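For completeness, here is a minimal self-contained sketch of the same call, assuming NumPy, pandas and matplotlib are installed. The seeded generator `rng`, the `Agg` backend and the `groups` dictionary are my own additions for illustration, not part of the question; `groups` just shows which values of a end up in each box.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so this runs without a display

import numpy as np
import pandas as pd

# Same shape of data as in the question; the seed is only there for reproducibility
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.random(10), "b": [x % 2 for x in range(10)]})

# Draw one box per distinct value of 'b', all in a single plot
df.boxplot(column="a", by="b")

# The values summarised by each box are the per-group values of 'a'
groups = {key: grp["a"].to_numpy() for key, grp in df.groupby("b")}
```

In an interactive session you would drop the `Agg` line and call `matplotlib.pyplot.show()` (or work in a notebook) to see the figure.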