
Create a new column based on a condition on a column of lists in pandas

I have a dataframe containing a column of lists:

col_1            
[A, A, A, B, C]
[D, B, C]
[C]
[A, A, A]
NaN

I want to create a new column that is 1 if the list starts with three A's and 0 otherwise:

col_1              new_col           
[A, A, A, B, C]    1
[D, B, C]          0
[C]                0
[A, A, A]          1
NaN                0

I tried this, but it didn't work:

df['new_col'] = df.loc[df.col_1[0:3] == [A, A, A]]
khaled koubaa asked Sep 25 '20 12:09


2 Answers

Because the column contains some non-list values (the NaN is a float), you can use a lambda with an if-else that returns 0 when the value is not a list:

print (df['col_1'].map(type))
0     <class 'list'>
1     <class 'list'>
2     <class 'list'>
3     <class 'list'>
4    <class 'float'>
Name: col_1, dtype: object

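# 1 if x is a list whose first three items are 'A', otherwise 0 (covers NaN and other non-list values)
f = lambda x: int(x[:3] == ['A', 'A', 'A']) if isinstance(x, list) else 0
df['new_col'] = df['col_1'].map(f)
# alternative
# df['new_col'] = df['col_1'].apply(f)
print(df)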
             col_1  new_col
0  [A, A, A, B, C]        1
1        [D, B, C]        0
2              [C]        0
3        [A, A, A]        1
4              NaN        0
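
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The attempt in the question slices the Series by rows (`df.col_1[0:3]` selects the first three rows), so the check has to be applied to each element instead. The DataFrame construction and the function name `starts_with_three_a` are my own assumptions for illustration, not part of the original snippet.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data, including the NaN row.
df = pd.DataFrame({
    "col_1": [
        ["A", "A", "A", "B", "C"],
        ["D", "B", "C"],
        ["C"],
        ["A", "A", "A"],
        np.nan,
    ]
})


def starts_with_three_a(value):
    """Return 1 if value is a list whose first three items are 'A', else 0."""
    # NaN is a float, so anything that is not a list maps to 0.
    if not isinstance(value, list):
        return 0
    # Slicing never raises on short lists; ['C'][:3] is just ['C'].
    return int(value[:3] == ["A", "A", "A"])


df["new_col"] = df["col_1"].map(starts_with_three_a)
print(df)
```

A named function behaves the same as the lambda here; it is only a bit easier to read and test.
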
jezrael answered Oct 02 '22 15:10


Here is another potential solution using map:

import pandas as pd

#borrowing dataframe from @Alexendra
df = pd.DataFrame({
    'col_1': [
      ['A', 'A', 'A', 'B', 'C'],
      ['D', 'B', 'C'],
      ['C'],
      ['A', 'A', 'A']
    ]
})

df['new_col'] = df['col_1'].map(lambda x: 1 if x[0:3] == ['A', 'A', 'A'] else 0)

print(df)

Output:

             col_1  new_col
0  [A, A, A, B, C]        1
1        [D, B, C]        0
2              [C]        0
3        [A, A, A]        1
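
Note that the sample dataframe above has no NaN row. With the NaN from the question, `x[0:3]` is applied to a float and raises a `TypeError`. A possible sketch of a guarded variant, building a boolean mask first and casting it to int, is below. The names `prefix`, `is_match` and `unguarded_error` are hypothetical, and the NaN row is added by me to match the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col_1": [
        ["A", "A", "A", "B", "C"],
        ["D", "B", "C"],
        ["C"],
        ["A", "A", "A"],
        np.nan,  # non-list row taken from the question
    ]
})

prefix = ["A", "A", "A"]  # hypothetical name for the pattern to match

# Unguarded version: slicing the float NaN raises TypeError.
try:
    df["col_1"].map(lambda x: 1 if x[0:3] == prefix else 0)
    unguarded_error = None
except TypeError as exc:
    unguarded_error = exc

# Guarded version: isinstance short-circuits before the slice is attempted.
is_match = df["col_1"].map(lambda x: isinstance(x, list) and x[:3] == prefix)
df["new_col"] = is_match.astype(int)

print(df)
```

Keeping the boolean mask around can be handy if you also want to filter rows with `df[is_match]`.
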
Grayrigel answered Oct 02 '22 16:10