I have a pandas DataFrame with a column that contains lists:
import pandas as pd
df = pd.DataFrame({'List': [['once', 'upon'], ['once', 'upon'], ['a', 'time'], ['there', 'was'], ['a', 'time']], 'Count': [2, 3, 4, 1, 2]})
Count  List
2      [once, upon]
3      [once, upon]
4      [a, time]
1      [there, was]
2      [a, time]
How can I combine the rows that have identical List values and sum their Count values? The expected result is:
Count  List
5      [once, upon]
6      [a, time]
1      [there, was]
I've tried:
df.groupby('List')['Count'].sum()
which results in:
TypeError: unhashable type: 'list'
One way is to convert the lists to tuples first. This is needed because groupby requires its keys to be hashable: tuples are immutable and hashable, but lists are not.
res = df.groupby(df['List'].map(tuple))['Count'].sum()
Result:
List
(a, time) 6
(once, upon) 5
(there, was) 1
Name: Count, dtype: int64
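To see why the conversion matters, here is a small plain-Python sketch of the same idea, without pandas. It only illustrates hashability and is not how groupby is implemented internally; the variable names (`rows`, `totals`) are made up for this example.

```python
# Data from the question, as (list, count) pairs.
rows = [
    (['once', 'upon'], 2),
    (['once', 'upon'], 3),
    (['a', 'time'], 4),
    (['there', 'was'], 1),
    (['a', 'time'], 2),
]

# A list cannot be hashed, so it cannot be used as a dict key or a group key.
try:
    hash(['once', 'upon'])
    list_is_hashable = True
except TypeError as exc:
    list_is_hashable = False
    error_message = str(exc)  # same kind of error groupby raised

# A tuple with the same contents is hashable, so it works as a key.
totals = {}
for words, count in rows:
    key = tuple(words)
    totals[key] = totals.get(key, 0) + count

print(list_is_hashable)
print(totals)
```

Grouping on `df['List'].map(tuple)` relies on the same property: each list is replaced by an equivalent hashable key before the groups are formed.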
If you need the result as lists in a dataframe, you can convert back:
res = df.groupby(df['List'].map(tuple))['Count'].sum().reset_index()
res['List'] = res['List'].map(list)
# List Count
# 0 [a, time] 6
# 1 [once, upon] 5
# 2 [there, was] 1
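If the row order of the expected result in the question matters (groups in order of first appearance rather than sorted), one option is to pass `sort=False` to groupby. The following is a self-contained sketch of that variant, using the data from the question; it is one possible way to put the steps together, not the only one.

```python
import pandas as pd

df = pd.DataFrame({
    'List': [['once', 'upon'], ['once', 'upon'], ['a', 'time'],
             ['there', 'was'], ['a', 'time']],
    'Count': [2, 3, 4, 1, 2],
})

# Group on a hashable tuple version of each list.
# sort=False keeps the groups in order of first appearance.
res = (
    df.groupby(df['List'].map(tuple), sort=False)['Count']
      .sum()
      .reset_index()  # turn the grouped Series back into a DataFrame
)

# Convert the tuple keys back to lists.
res['List'] = res['List'].map(list)

print(res)
```

The trade-off is that the List column holds lists again afterwards, so any further groupby on it would need the same tuple conversion.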