pandas: get all groupby values in an array [duplicate]

I'm sure this has been asked before, sorry if duplicate. Suppose I have the following dataframe:

import pandas as pd

df = pd.DataFrame({'key': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'data': range(6)}, columns=['key', 'data'])

>>
    key data
0   A   0
1   B   1
2   C   2
3   A   3
4   B   4
5   C   5

I know that grouping on 'key' and aggregating, e.g. df.groupby('key').sum(), gives a result like this:

>> 
    data
key 
A   3
B   5
C   7

What is the easiest way to get all the split data for each group in an array, like this?

>> 
    data
key 
A   [0, 3]
B   [1, 4]
C   [2, 5]

I'm not necessarily grouping by just one key but by several columns as well ('year' and 'month', for example), which is why I'd like to use the groupby function while preserving all the grouped values in an array.

ru111 asked Mar 12 '19 15:03


People also ask

Does pandas Groupby return series?

When the series are of different lengths, it returns a multi-indexed Series. However, if every series has the same length, it pivots the result into a DataFrame.

How do I turn a Groupby object into a list?

You can group DataFrame rows into a list by calling pandas.DataFrame.groupby() on the column of interest, selecting the column you want as a list from each group, and then using Series.apply(list) to get the list for every group.


1 Answer

You can use apply(list):

print(df.groupby('key').data.apply(list).reset_index())

  key    data
0   A  [0, 3]
1   B  [1, 4]
2   C  [2, 5]
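
The question also mentions grouping by several columns. As a minimal sketch, assuming hypothetical 'year' and 'month' columns with made-up values (they are not part of the original data), the same apply(list) approach should work when a list of keys is passed to groupby:

```python
import pandas as pd

# Hypothetical data: the 'year' and 'month' columns and their values are
# assumptions for illustration; only 'key' and 'data' come from the question.
df = pd.DataFrame({
    'key':   ['A', 'B', 'A', 'B', 'A', 'B'],
    'year':  [2018, 2018, 2018, 2018, 2019, 2019],
    'month': [1, 1, 1, 1, 2, 2],
    'data':  range(6),
})

# Group by several columns at once, then collect each group's 'data'
# values into a Python list. reset_index() turns the group keys back
# into ordinary columns.
grouped = (
    df.groupby(['key', 'year', 'month'])['data']
      .apply(list)
      .reset_index()
)

print(grouped)
```

Each row of grouped then holds one (key, year, month) combination together with the list of its 'data' values, for example [0, 2] for ('A', 2018, 1). If NumPy arrays are preferred over lists, the lists can be converted afterwards.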

anky answered Nov 10 '22 19:11