Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas DataFrame : groupby then transpose

Tags:

python

pandas

I have this DataFrame :

df = pd.DataFrame({'1-sensor':['608', '608', '2158', '2158'],
          '2-day':['2017-12-11', '2017-12-12', '2017-12-11', '2017-12-12'],
          '3-voltage':[30, 31, 28, 29]})

df :

  1-sensor       2-day  3-voltage
0      608  2017-12-11         30
1      608  2017-12-12         31
2     2158  2017-12-11         28
3     2158  2017-12-12         29

And I would like to make it look like this :

pd.DataFrame({'1-sensor':['608', '2158'],
              '2017-12-11':[30, 28],
              '2017-12-12':[31, 29]})

df :

  1-sensor  2017-12-11  2017-12-12
0      608          30          31
1     2158          28          29

I have tried different solutions combining .groupby('sensor') and .unstack() but not successful. Any good suggestions ?

Thanks!

like image 779
Jusstol Avatar asked Dec 19 '17 16:12

Jusstol


2 Answers

Option 1
Use pivot, if there are no aggregations. This is actually more efficient -

df.pivot(index='1-sensor', columns='2-day', values='3-voltage')

2-day     2017-12-11  2017-12-12
1-sensor                        
2158              28          29
608               30          31

If there are any aggregations to be performed, use pivot_table and include aggfunc=....


Option 2
Another solution with set_index + unstack -

df.set_index(['1-sensor', '2-day']).unstack()

          3-voltage           
2-day    2017-12-11 2017-12-12
1-sensor                      
2158             28         29
608              30         31

If aggregations are required, then groupby + unstack will do it. You were this close.

df.groupby(['1-sensor', '2-day'], sort=False).sum().unstack()  

          3-voltage           
2-day    2017-12-11 2017-12-12
1-sensor                      
608              30         31
2158             28         29

Replace sum with the aggfunction you want to use, if you need to use one.

like image 139
cs95 Avatar answered Sep 16 '22 14:09

cs95


You can also do it with pd.pivot_table:

pd.pivot_table(df, index = ['1-sensor'], columns = ['2-day'], values = '3-voltage').reset_index()
like image 39
Jacob H Avatar answered Sep 17 '22 14:09

Jacob H