I have a DataFrame with two columns. I want to create a new column that holds, for each row, whichever of the two values is the longer string. For example:
   column_a       column_b         column_c
0  'dog is fast'  'dog is faster'  'dog is faster'  (desired output)
I tried the code below but got an error saying that an int is not iterable. I was planning to merge the resulting Series into the df afterwards, since I wasn't sure how to write the result directly into a column of the df.
column_c = pd.Series()
for i in len(df.column_a):
    if len(df.column_a.iloc[i]) >= len(df.column_b.iloc[0]):
        column_c.append(df.column_a.iloc[i])
    else:
        column_c.append(df.column_b.iloc[i])
Any help is appreciated.
Use pandas.DataFrame.apply along the rows (axis=1), with max and key=len to pick the longest string in each row.
Given sample data:
import pandas as pd
df = pd.DataFrame([['fast', 'faster'], ['slower', 'slow']])
        0       1
0    fast  faster
1  slower    slow
# For each row (axis=1), keep the value with the greatest length
df['column_c'] = df.apply(lambda x: max(x, key=len), axis=1)
Output:
        0       1 column_c
0    fast  faster   faster
1  slower    slow   slower
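As for the error in the question: `for i in len(df.column_a)` tries to iterate over an integer, so the loop needs `range(len(...))`. Also, `Series.append` returned a new Series rather than modifying `column_c` in place, and it was removed in pandas 2.0, so collecting the values in a plain list is safer. The sketch below shows a corrected loop and, as an alternative to `apply`, a vectorized version using `Series.str.len` and `Series.where`. The sample data and the `column_c_loop` name are assumptions for illustration; the column names follow the question.

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({
    "column_a": ["dog is fast", "slower"],
    "column_b": ["dog is faster", "slow"],
})

# Corrected loop: iterate over range(len(df)) and collect results in a list.
longest = []
for i in range(len(df)):
    a = df["column_a"].iloc[i]
    b = df["column_b"].iloc[i]
    longest.append(a if len(a) >= len(b) else b)
df["column_c_loop"] = longest

# Vectorized alternative: keep column_a where it is at least as long
# as column_b, otherwise fall back to column_b.
a_is_longer = df["column_a"].str.len() >= df["column_b"].str.len()
df["column_c"] = df["column_a"].where(a_is_longer, df["column_b"])

print(df[["column_a", "column_b", "column_c"]])
```

Both versions keep column_a on ties, which matches the `>=` in the original attempt. The vectorized form avoids a Python-level call per row, which may matter on large frames.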