
Creating a new column on conditional of two other columns pandas

Tags: python, pandas

I have a dataframe with two columns. I want to create a new column that holds, for each row, whichever of the two columns has the longer string. For example:

        column_a        column_b             column_c

   0  'dog is fast'   'dog is faster'      'dog is faster' (desired output)

I tried this code but got an error saying that int is not iterable. I was planning to merge the resulting series into the df afterwards, since I wasn't sure how to write the result directly into a column of the df.

column_c = pd.Series()

for i in len(df.column_a):
    if len(df.column_a.iloc[i]) >= len(df.column_b.iloc[0]):
        column_c.append(df.column_a.iloc[i])
    else:
        column_c.append(df.column_b.iloc[i])

Any help is appreciated.

B B asked Mar 05 '23 06:03


1 Answer

Use pandas.DataFrame.apply:

Given sample data:

import pandas as pd

df = pd.DataFrame([['fast', 'faster'], ['slower', 'slow']])
        0       1
0    fast  faster
1  slower    slow

# For each row (axis=1), keep the longest string; on a tie, max returns the first one.
df['column_c'] = df.apply(lambda row: max(row, key=len), axis=1)

Output:

        0       1 column_c
0    fast  faster   faster
1  slower    slow   slower
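
As for the error in the question: len(df.column_a) returns a single int, so "for i in len(...)" has nothing to iterate over, and the loop needs range(len(...)) instead. The sketch below is one possible repair of that loop, followed by a vectorized alternative to apply that compares string lengths column-wise. It assumes the question's column names (column_a, column_b) and that both columns contain only strings with no missing values. The name column_c_loop is hypothetical and only there so the two results can be compared.

```python
import numpy as np
import pandas as pd

# Sample frame using the column names from the question (assumed data).
df = pd.DataFrame({
    "column_a": ["dog is fast", "slower"],
    "column_b": ["dog is faster", "slow"],
})

# Repaired loop: iterate over range(len(...)), index both columns with i,
# and collect into a plain list. Series.append never modified a Series
# in place, and it was removed in pandas 2.0.
longest = []
for i in range(len(df)):
    a = df["column_a"].iloc[i]
    b = df["column_b"].iloc[i]
    longest.append(a if len(a) >= len(b) else b)
df["column_c_loop"] = longest  # hypothetical column name, for comparison only

# Vectorized alternative: compare lengths for all rows at once,
# then pick column_a where it is at least as long, else column_b.
a_is_longer = df["column_a"].str.len() >= df["column_b"].str.len()
df["column_c"] = np.where(a_is_longer, df["column_a"], df["column_b"])

print(df)
```

Both versions prefer column_a on a tie, matching the >= in the question. On large frames the np.where form will usually be faster than apply with axis=1, since it avoids calling a Python function for every row.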
Chris answered May 15 '23 04:05