Pandas: transforming the DataFrameGroupBy object to desired format

I have a data frame as follows:

import pandas as pd
import numpy as np
df = pd.DataFrame({'id' : range(1,9),
                   'code' : ['one', 'one', 'two', 'three',
                             'two', 'three', 'one', 'two'],
                   'colour': ['black', 'white','white','white',
                           'black', 'black', 'white', 'white'],
                   'amount' : np.random.randn(8)},  columns= ['id','code','colour','amount'])

I want to group the ids by code and colour and then sort them with respect to amount. I know how to use groupby():

df.groupby(['code','colour']).head(5)
                id   code colour    amount
code  colour                              
one   black  0   1    one  black -0.117307
      white  1   2    one  white  1.653216
             6   7    one  white  0.817205
three black  5   6  three  black  0.567162
      white  3   4  three  white  0.579074
two   black  4   5    two  black -1.683988
      white  2   3    two  white -0.457722
             7   8    two  white -1.277020

However, my desired output is as below, with two columns: (1) code/colour contains the key strings, and (2) id:amount contains id-amount pairs sorted in descending order with respect to amount:

code/colour  id:amount
one/black    {1:-0.117307}
one/white    {2:1.653216, 7:0.817205}
three/black  {6:0.567162}
three/white  {4:0.579074}
two/black    {5:-1.683988}
two/white    {3:-0.457722, 8:-1.277020}

How can I transform the DataFrameGroupBy object displayed above into my desired format? Or should I not use groupby() in the first place?

EDIT: Although not in the specified format, the code below kind of gives me the functionality I want:

groups = dict(list(df.groupby(['code','colour'])))
groups['one','white']
   id code colour    amount
1   2  one  white  1.331766
6   7  one  white  0.808739

How can I reduce the groups to only include the id and amount columns?
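
One way to get there, as a minimal sketch rather than a definitive answer: build the dict of groups with a comprehension and keep only the two columns from each group. The fixed random seed is an assumption added here so reruns match; it is not in the original post.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed only so that reruns give the same numbers

df = pd.DataFrame({
    'id': range(1, 9),
    'code': ['one', 'one', 'two', 'three', 'two', 'three', 'one', 'two'],
    'colour': ['black', 'white', 'white', 'white', 'black', 'black', 'white', 'white'],
    'amount': np.random.randn(8),
}, columns=['id', 'code', 'colour', 'amount'])

# Iterating a GroupBy yields (key, sub-DataFrame) pairs; with two grouping
# columns each key is a (code, colour) tuple. Keep only id and amount.
groups = {key: group[['id', 'amount']]
          for key, group in df.groupby(['code', 'colour'])}

print(groups['one', 'white'])
```

Each value is still a DataFrame with the original row index, so groups['one', 'white'] shows the rows for ids 2 and 7 with just the id and amount columns.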

Zhubarb asked Jan 14 '14 10:01



1 Answer

First, group by code and colour, then apply a custom function that builds an id-to-amount dict for each group:

df = df.groupby(['code', 'colour']).apply(lambda x:x.set_index('id').to_dict('dict')['amount'])
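
To see what the lambda does, here is a small sketch on a single hand-made group. The frame and its values are hypothetical, standing in for the one/white rows.

```python
import pandas as pd

# Hypothetical single group, standing in for the rows where
# code == 'one' and colour == 'white'.
group = pd.DataFrame({'id': [2, 7], 'amount': [1.5, 0.8]})

# Index the rows by id, then turn the amount column into a plain dict.
as_dict = group.set_index('id')['amount'].to_dict()

# The expression in the answer goes the other way round: convert the whole
# frame to a dict of column dicts, then pick the 'amount' entry.
same = group.set_index('id').to_dict()['amount']

print(as_dict)
print(same)
```

Both expressions give the same mapping from id to amount; selecting the column first just avoids converting columns that are thrown away.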

Then join each (code, colour) index tuple into a single string:

df.index = ['/'.join(i) for i in df.index]

This returns a Series; you can convert it back to a DataFrame with:

df = df.reset_index()

Finally, set the column names:

df.columns=['code/colour','id:amount']

Result:

In [105]: df
Out[105]: 
   code/colour                               id:amount
0    one/black                     {1: 0.392264412544}
1    one/white  {2: 2.13950686015, 7: -0.393002947047}
2  three/black                      {6: -2.0766612539}
3  three/white                     {4: -1.18058561325}
4    two/black                     {5: -1.51959565941}
5    two/white  {8: -1.7659863039, 3: -0.595666853895}
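
Note that the dicts above are not ordered by amount. If the descending order from the question matters, a possible variant is sketched below. It assumes Python 3.7+ (where dicts keep insertion order) and a pandas version that has sort_values; the helper name id_amount_dict and the fixed seed are made up for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed only so that reruns give the same numbers

df = pd.DataFrame({
    'id': range(1, 9),
    'code': ['one', 'one', 'two', 'three', 'two', 'three', 'one', 'two'],
    'colour': ['black', 'white', 'white', 'white', 'black', 'black', 'white', 'white'],
    'amount': np.random.randn(8),
}, columns=['id', 'code', 'colour', 'amount'])


def id_amount_dict(group):
    """Hypothetical helper: {id: amount} with the largest amount first."""
    ordered = group.sort_values('amount', ascending=False)
    return dict(zip(ordered['id'], ordered['amount']))


# One output row per (code, colour) group; the key tuple becomes 'code/colour'.
rows = [
    {'code/colour': '/'.join(key), 'id:amount': id_amount_dict(group)}
    for key, group in df.groupby(['code', 'colour'])
]
result = pd.DataFrame(rows, columns=['code/colour', 'id:amount'])

print(result)
```

Looping over the groups instead of calling apply keeps the sketch independent of how a given pandas version passes the grouping columns to the applied function.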
waitingkuo answered Sep 18 '22 04:09
