I have a dataframe as follows:
ID Date Text
1 01/01/2019 abcd
1 01/01/2019 pqrs
2 01/02/2019 abcd
2 01/02/2019 xyze
I want to merge the Text column by grouping on ID in Python, using a group-by operation, so that the result looks like this:
ID Date Text
1 01/01/2019 abcdpqrs
2 01/02/2019 abcdxyze
I want to do this in Python. I have tried the following code snippets, but they didn't work:
groups = groupby(dataset_new, key=ID(1))
dataset_new.group_by{row['Reference']}.values.each do |group|
puts [group.first['Reference'], group.map{|r| r['Text']} * ' '] * ' | '
end
I also tried to merge the text in Excel using formulas, but that did not give the required result either.
Try groupby and sum. Judging from your error message and the output of df.info(), it seems there are mixed dtypes and NaN values in the Text column. I suggest converting NaN to an empty string using fillna(''), then converting all elements in the column to strings using astype(str).
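import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 2, 2],
                   'Date': ['01/01/2019', '01/01/2019', '01/02/2019', '01/02/2019'],
                   'Text': ['abcd', 'pqrs', 'abcd', 'xyze']})
# Replace NaN with '' and make every value a string so concatenation works
df['Text'] = df['Text'].fillna('').astype(str)
# Summing strings within each (ID, Date) group concatenates them
df_grouped = df.groupby(['ID', 'Date'])['Text'].sum()
print(df_grouped)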
This should return:
ID Date
1 01/01/2019 abcdpqrs
2 01/02/2019 abcdxyze
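If you need a regular DataFrame with ID, Date and Text as columns (as in your expected output) rather than a Series indexed by ID and Date, one possible variant is to join the strings explicitly and then reset the index. This is only a sketch: the helper name merge_text is made up for illustration, and it assumes the columns are named exactly as in your example.

```python
import pandas as pd


def merge_text(df: pd.DataFrame) -> pd.DataFrame:
    """Concatenate Text within each (ID, Date) group (hypothetical helper)."""
    out = df.copy()
    # Replace NaN with '' and force strings so ''.join never sees a float
    out['Text'] = out['Text'].fillna('').astype(str)
    # Join the strings of each group, then turn the group keys back into columns
    return out.groupby(['ID', 'Date'])['Text'].agg(''.join).reset_index()


df = pd.DataFrame({'ID': [1, 1, 2, 2],
                   'Date': ['01/01/2019', '01/01/2019', '01/02/2019', '01/02/2019'],
                   'Text': ['abcd', 'pqrs', 'abcd', 'xyze']})
merged = merge_text(df)
print(merged)
```

Using ''.join instead of sum makes the intent explicit, and passing ' '.join instead would put a space between the merged pieces.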