This is my dataframe:
> df
a b
0 1 set([2, 3])
1 2 set([2, 3])
2 3 set([4, 5, 6])
3 1 set([1, 34, 3, 2])
Now when I groupby, I want to merge the sets within each group. If column b held lists instead, there would be no problem. But the output of my command is:
> df.groupby('a').sum()
a b
1 NaN
2 set([2, 3])
3 set([4, 5, 6])
What should I do in groupby to update sets? The output I'm looking for is as below:
a b
1 set([2, 3, 1, 34])
2 set([2, 3])
3 set([4, 5, 6])
This might be close to what you want:
df.groupby('a').apply(lambda x: set.union(*x.b))
In this case it takes the union of the sets.
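A minimal sketch of what set.union(*...) does, without pandas. The name group_sets is only illustrative; it holds the two sets that fall into group a == 1 in the question's data. Calling set.union on the class treats the first set as the receiver and the rest as arguments, so unpacking a sequence of sets unions them all.

```python
# The two sets that belong to group a == 1 in the question's data
# (group_sets is a hypothetical name used only for this illustration)
group_sets = [{2, 3}, {1, 34, 3, 2}]

# set.union(*group_sets) is equivalent to group_sets[0].union(group_sets[1])
merged = set.union(*group_sets)
print(merged == {1, 2, 3, 34})  # True

# Caveat: set.union(*[]) raises TypeError because there is no first set.
# Starting from an empty set avoids that edge case.
safe = set().union(*[])
print(safe)  # set()
```

If a group could ever be empty, set().union(*x) is the more defensive spelling; for the data in the question both forms give the same result.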
If you need to keep the column names you could use:
df.groupby('a').agg({'b':lambda x: set.union(*x)}).reset_index('a')
Result:
a b
0 1 set([1, 2, 3, 34])
1 2 set([2, 3])
2 3 set([4, 5, 6])
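For anyone who wants to reproduce this end to end, here is a self-contained sketch. It rebuilds the dataframe from the question and, as a small variation on the answer above (my assumption, not the answerer's exact code), selects column b before aggregating and uses set().union(*s) so an empty group would not raise.

```python
import pandas as pd

# Rebuild the dataframe from the question
df = pd.DataFrame({
    "a": [1, 2, 3, 1],
    "b": [{2, 3}, {2, 3}, {4, 5, 6}, {1, 34, 3, 2}],
})

# Select column b, then union all sets within each group of a.
# reset_index() turns the group key back into a regular column.
result = (
    df.groupby("a")["b"]
      .agg(lambda s: set().union(*s))
      .reset_index()
)
print(result)
```

On a recent Python the sets print as {1, 2, 3, 34} rather than set([1, 2, 3, 34]); the contents are the same.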