Given a dataframe structured like:
rule_id | ordering | sequence_id
1 | 0 | 12
1 | 1 | 13
1 | 1 | 14
2 | 0 | 1
2 | 1 | 2
2 | 2 | 12
I need to transform it into:
rule_id | sequences
1 | [[12],[13,14]]
2 | [[1],[2],[12]]
That seems like an easy groupby-into-groupby-to-list operation; however, I cannot make it work in pandas.
df.groupby(['rule_id', 'ordering'])['sequence_id'].apply(list)
leaves me with
rule_id  ordering
1        0               [12]
         1           [13, 14]
2        0                [1]
         1                [2]
         2               [12]
How does one apply another groupby operation to further concatenate the results into one list?
You can group DataFrame rows into a list by using the pandas.DataFrame.groupby() function on the column of interest, selecting the column you want as a list from each group, and then using Series.apply(list) to get the list for every group.
The Hello, World! of pandas GroupBy: you call .groupby() and pass the name of the column that you want to group on, which is "state". Then, you use ["last_name"] to specify the columns on which you want to perform the actual aggregation. You can pass a lot more than just a single column name to .
To merge rows within a group together in pandas, we can use the agg() method together with the join() method to concatenate the row values.
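As a rough illustration of the two ideas above, here is a minimal sketch using the question's data. The variable names (by_rule, joined) are made up for this example, and the comma separator is just an assumption about how you might want the values joined.

```python
import pandas as pd

# The dataframe from the question.
df = pd.DataFrame({
    "rule_id": [1, 1, 1, 2, 2, 2],
    "ordering": [0, 1, 1, 0, 1, 2],
    "sequence_id": [12, 13, 14, 1, 2, 12],
})

# Group on one column and collect the selected column into a list per group.
by_rule = df.groupby("rule_id")["sequence_id"].apply(list)

# Alternatively, concatenate the values of each group into a single string.
# join() only accepts strings, so the integers are converted first.
joined = df.groupby("rule_id")["sequence_id"].agg(
    lambda s: ",".join(s.astype(str))
)

print(by_rule)
print(joined)
```

Note that neither of these produces the nested lists asked for; they only flatten each rule_id into one list or one string.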
Use another groupby by the first level of the MultiIndex:
df.groupby(['rule_id', 'ordering'])['sequence_id'].apply(list).groupby(level=0).apply(list)
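For completeness, here is a self-contained sketch of the chained groupby on the question's data. The intermediate names (inner, result) and the final reset_index step, which turns the Series back into a two-column frame with a "sequences" column, are additions for illustration and not part of the one-liner.

```python
import pandas as pd

# The dataframe from the question.
df = pd.DataFrame({
    "rule_id": [1, 1, 1, 2, 2, 2],
    "ordering": [0, 1, 1, 0, 1, 2],
    "sequence_id": [12, 13, 14, 1, 2, 12],
})

# Step 1: one list of sequence_id values per (rule_id, ordering) pair.
# The result is a Series with a two-level MultiIndex.
inner = df.groupby(["rule_id", "ordering"])["sequence_id"].apply(list)

# Step 2: group that Series by the first index level (rule_id) and
# collect the inner lists into an outer list.
result = (
    inner.groupby(level=0)
    .apply(list)
    .reset_index(name="sequences")
)

print(result)
```

Because groupby sorts by the group keys by default, the inner lists come out in ascending ordering within each rule_id.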