Boxplot with pandas and groupby

I have the following dataset sample:

     0         1
0    0  0.040158
1    2  0.500642
2    0  0.005694
3    1  0.065052
4    0  0.034789
5    2  0.128495
6    1  0.088816
7    1  0.056725
8    0 -0.000193
9    2 -0.070252
10   2  0.138282
11   2  0.054638
12   2  0.039994
13   2  0.060659
14   0  0.038562

I need a box-and-whisker plot of column 1, grouped by column 0. I have the following:

plt.figure()
grouped = df.groupby(0)
grouped.boxplot(column=1)
plt.savefig('plot.png')

But I end up with three subplots. How can I place all three boxes on one plot? Thanks.

nicholas.reichel asked Apr 25 '15 16:04





2 Answers

In pandas 0.16.0, you can simply do this (the column labels here are the integers 0 and 1, as in the question; use by='0' if yours are strings):

df.boxplot(column=1, by=0)

fixxxer answered Oct 14 '22 04:10



I don't believe you need to use groupby.

df2 = df.pivot(columns=df.columns[0])  # the existing index is used by default
df2.columns = df2.columns.droplevel()  # drop the outer level (the value column's label)

>>> df2
0          0         1         2
0   0.040158       NaN       NaN
1        NaN       NaN  0.500642
2   0.005694       NaN       NaN
3        NaN  0.065052       NaN
4   0.034789       NaN       NaN
5        NaN       NaN  0.128495
6        NaN  0.088816       NaN
7        NaN  0.056725       NaN
8  -0.000193       NaN       NaN
9        NaN       NaN -0.070252
10       NaN       NaN  0.138282
11       NaN       NaN  0.054638
12       NaN       NaN  0.039994
13       NaN       NaN  0.060659
14  0.038562       NaN       NaN

df2.boxplot()
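A runnable sketch of the pivot approach with the question's data follows. It is a variation on the snippet above, not the answer's exact code: it passes values=1 so the result has flat columns and no droplevel() step is needed. The name wide and the non-interactive backend are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Sample data from the question; columns are labelled with the integers 0 and 1.
df = pd.DataFrame({
    0: [0, 2, 0, 1, 0, 2, 1, 1, 0, 2, 2, 2, 2, 2, 0],
    1: [0.040158, 0.500642, 0.005694, 0.065052, 0.034789, 0.128495,
        0.088816, 0.056725, -0.000193, -0.070252, 0.138282, 0.054638,
        0.039994, 0.060659, 0.038562],
})

# Spread column 1 into one column per distinct value of column 0.
# Rows that do not belong to a group hold NaN in that group's column.
wide = df.pivot(columns=0, values=1)

# DataFrame.boxplot draws one box per column and ignores the NaN entries.
fig, ax = plt.subplots()
wide.boxplot(ax=ax)
```

Each column of wide holds only the values of one group, so the boxes summarise 5, 3, and 7 observations for groups 0, 1, and 2 respectively.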

Alexander answered Oct 14 '22 06:10
