How to concatenate sets when using groupby on a pandas DataFrame?

Tags: python, pandas

This is my dataframe:

> df
       a             b
    0  1         set([2, 3])
    1  2         set([2, 3])
    2  3      set([4, 5, 6])
    3  1  set([1, 34, 3, 2])

When I group by `a`, I want the sets in each group to be merged into one. With lists this would be no problem, since `sum()` concatenates them, but sets do not support `+`, so the output of my command is:

> df.groupby('a').sum()

a         b                
1             NaN
2     set([2, 3])
3  set([4, 5, 6])  

What should I do in the groupby to merge the sets? The output I'm looking for is:

a         b                
1     set([2, 3, 1, 34])
2     set([2, 3])
3     set([4, 5, 6])  
Alireza asked Oct 06 '15 10:10



1 Answer

This might be close to what you want:

df.groupby('a').apply(lambda x: set.union(*x.b))

For each group, this takes the union of all the sets in column `b`.
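As a rough illustration of what `set.union(*x.b)` does for a single group, here is the same call in plain Python. The variable names are made up for this sketch; the values are the two sets from the rows where `a` is 1.

```python
# Hypothetical stand-in for the sets in the group where a == 1.
group_sets = [{2, 3}, {1, 34, 3, 2}]

# Unpacking passes each set as a separate argument, so this is
# equivalent to group_sets[0].union(group_sets[1]).
merged = set.union(*group_sets)

# Starting from an empty set gives the same result and also copes with
# an empty list of sets, where set.union(*[]) would raise a TypeError.
merged_safe = set().union(*group_sets)
empty_ok = set().union(*[])
```

The `set().union(...)` form is only a defensive variant; for non-empty groups both calls return the same set.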

If you need to keep the column names, you could use:

df.groupby('a').agg({'b':lambda x: set.union(*x)}).reset_index('a')

Result:

    a   b
0   1   set([1, 2, 3, 34])
1   2   set([2, 3])
2   3   set([4, 5, 6])
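To try this end to end, a minimal self-contained sketch follows. It assumes the DataFrame from the question can be rebuilt with set literals, and it selects column `b` before aggregating, which is a slight variation on the calls above rather than the only way to write it.

```python
import pandas as pd

# Rebuild the DataFrame from the question using set literals.
df = pd.DataFrame({
    "a": [1, 2, 3, 1],
    "b": [{2, 3}, {2, 3}, {4, 5, 6}, {1, 34, 3, 2}],
})

# Select column b, then union the sets within each group of a.
# reset_index() turns the group key back into a regular column.
result = (
    df.groupby("a")["b"]
    .agg(lambda sets: set().union(*sets))
    .reset_index()
)
print(result)
```

Each row of `result` holds one value of `a` and the union of every set that appeared with it.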
matt_s answered Nov 08 '22 18:11
