pandas groupby concatenate strings in multiple columns

I have this pandas data frame:

from pandas import DataFrame
df = DataFrame({'id':['a','b','b','b','c','c'], 'category':['z','z','x','y','y','y'], 'category2':['1','2','2','2','1','2']})

which looks like:

  category category2 id
0        z         1  a
1        z         2  b
2        x         2  b
3        y         2  b
4        y         1  c
5        y         2  c

What I'd like to do is group by id and return the other two columns as a concatenation of their unique strings.

The outcome would look like:

  category category2 id
0        z         1  a
1      zxy         2  b
2        y        12  c
Blue Moon asked Aug 20 '15 12:08



1 Answer

Use groupby/agg to aggregate the groups. For each group, apply set to find the unique strings, and ''.join to concatenate the strings:

In [34]: df.groupby('id').agg(lambda x: ''.join(set(x)))
Out[34]: 
   category category2
id                   
a         z         1
b       yxz         2
c         y        12

To move id from the index to a column of the resultant DataFrame, call reset_index:

In [59]: df.groupby('id').agg(lambda x: ''.join(set(x))).reset_index()
Out[59]: 
  id category category2
0  a        z         1
1  b      yxz         2
2  c        y        12
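
Because set is unordered, the concatenation order is not guaranteed, which is why group b comes out as yxz above rather than the zxy shown in the question. If first-appearance order matters, one possible variant is sketched below. It assumes Python 3.7+, where dict keys keep insertion order, and join_unique is a hypothetical helper name, not something from the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['a', 'b', 'b', 'b', 'c', 'c'],
    'category': ['z', 'z', 'x', 'y', 'y', 'y'],
    'category2': ['1', '2', '2', '2', '1', '2'],
})

def join_unique(values):
    # Hypothetical helper: dict.fromkeys drops duplicates while keeping
    # the order in which each string first appears (Python 3.7+).
    return ''.join(dict.fromkeys(values))

# agg calls join_unique once per column within each id group.
result = df.groupby('id').agg(join_unique).reset_index()
print(result)
```

With the sample data this should give zxy for group b, since z, x and y first appear in that order within the group.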
unutbu answered Nov 10 '22 00:11