
Reordering columns/rows of a pivot_table?

pandas's pivot_table seems to return columns only in alphabetical order, such that

pd.pivot_table(tips, 'tip_pct', index=['sex', 'day'], columns='smoker', aggfunc=len)

gives:

smoker       No  Yes
sex    day
Female Fri    2    7
       Sat   13   15
       Sun   14    4
       Thur  25    7
Male   Fri    2    8
       Sat   32   27
       Sun   43   15
       Thur  20   10

If I wanted to put Thur above Fri, and Yes to the left of No, how would I go about it?

Jonathan Harford asked Aug 12 '13 14:08




2 Answers

Using categoricals, introduced in pandas 0.15, the 'day' and 'smoker' columns can be converted to categories with a predefined order. pivot_table() then keeps the rows and columns in that category order instead of sorting them alphabetically.

>>> pt = pd.pivot_table(df, 'tip_pct', index=['sex', 'day'], columns='smoker', aggfunc='sum')
>>> pt

smoker      No  Yes
sex    day         
Female Fri   0    4
       Sat   0    5
       Sun   0    5
       Thu   9    3
Male   Fri   0    4
       Sat   1    5
       Sun   1    5
       Thu   9    3

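>>> # astype('category', categories=...) is not accepted by current pandas; pass a CategoricalDtype instead
>>> df["day"] = df["day"].astype(pd.CategoricalDtype(["Thu", "Fri", "Sat", "Sun"], ordered=True))
>>> df["smoker"] = df["smoker"].astype(pd.CategoricalDtype(["Yes", "No"], ordered=True))
>>> pt = pd.pivot_table(df, 'tip_pct', index=['sex', 'day'], columns='smoker', aggfunc='sum')
>>> pt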

smoker      Yes  No
sex    day         
Female Thu    3   9
       Fri    4   0
       Sat    5   0
       Sun    5   0
Male   Thu    3   9
       Fri    4   0
       Sat    5   1
       Sun    5   1
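
For readers who want to try this end to end, here is a minimal self-contained sketch of the same idea. The sample rows are made up for illustration and are not the dataset from the question; `observed=True` is passed explicitly here only to keep the result limited to combinations that actually occur.

```python
import pandas as pd

# Hypothetical sample data, NOT the tips dataset from the question.
df = pd.DataFrame({
    "sex": ["Female", "Female", "Male", "Male", "Male", "Female"],
    "day": ["Fri", "Thu", "Sat", "Thu", "Sun", "Sat"],
    "smoker": ["Yes", "No", "No", "Yes", "Yes", "No"],
    "tip_pct": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# Declare the desired order once, as categorical dtypes.
day_type = pd.CategoricalDtype(["Thu", "Fri", "Sat", "Sun"], ordered=True)
smoker_type = pd.CategoricalDtype(["Yes", "No"], ordered=True)
df["day"] = df["day"].astype(day_type)
df["smoker"] = df["smoker"].astype(smoker_type)

# Rows and columns now follow the category order, not alphabetical order.
pt = pd.pivot_table(
    df,
    values="tip_pct",
    index=["sex", "day"],
    columns="smoker",
    aggfunc="sum",
    observed=True,
)
print(pt)
```

With this sample, "Yes" comes out to the left of "No" and "Thu" sorts above "Fri" within each sex.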
Rahul Sekar answered Sep 23 '22 13:09


Assign each value a numeric sort key (its position in the order you want), then sort the frame on that key. It's definitely a hack.

In [47]: list_of_days = ['Mon', 'Tues', 'Wed', 'Thur', 'Fri', 'Sat', 'Sun']

In [48]: ri = df.reset_index()

In [49]: day = ri['day']  # one entry per row; np.unique would return only the distinct days

In [50]: day_index = [list_of_days.index(d) for d in day]

In [51]: ri['day_index'] = day_index

In [52]: ri
Out[52]:
   day_index     sex   day  no  yes
0          4  female   Fri   1   47
1          5  female   Sat  42   16
2          6  female   Sun  15   48
3          3  female  Thur  15   49
4          4    male   Fri  48   42
5          5    male   Sat  41   14
6          6    male   Sun   6   36
7          3    male  Thur   9   20

In [53]: ri.sort_values(['sex', 'day_index', 'day']).set_index(['sex', 'day']).drop('day_index', axis=1)
Out[53]:
             no  yes
sex    day
female Thur  15   49
       Fri    1   47
       Sat   42   16
       Sun   15   48
male   Thur   9   20
       Fri   48   42
       Sat   41   14
       Sun    6   36

I've only shown the solution for the index, but you can do something similar for the columns.
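
As a rough sketch of the whole approach in one runnable piece: the values below are copied from Out[52], while the frame construction and the names `result` and `column_order` are illustrative. For the columns, this uses `reindex(columns=...)` with an explicit list, which is one possible way to do it and not necessarily what was meant by "something similar".

```python
import pandas as pd

list_of_days = ['Mon', 'Tues', 'Wed', 'Thur', 'Fri', 'Sat', 'Sun']

# Values copied from Out[52]; building the frame this way is an assumption.
ri = pd.DataFrame({
    'sex': ['female'] * 4 + ['male'] * 4,
    'day': ['Fri', 'Sat', 'Sun', 'Thur'] * 2,
    'no': [1, 42, 15, 15, 48, 41, 6, 9],
    'yes': [47, 16, 48, 49, 42, 14, 36, 20],
})

# Rows: add a helper key holding each day's position, sort on it, then drop it.
ri['day_index'] = [list_of_days.index(d) for d in ri['day']]
result = (
    ri.sort_values(['sex', 'day_index'])
      .set_index(['sex', 'day'])
      .drop(columns='day_index')
)

# Columns: spell out the order you want.
column_order = ['yes', 'no']
result = result.reindex(columns=column_order)
print(result)
```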

Phillip Cloud answered Sep 22 '22 13:09