
How to reclassify a pandas dataframe column?

I have a Pandas dataframe that looks something like this:

> print(df)

           image_name                       tags
0                img1       class1 class2 class3
1                img2                     class2
2                img3              class2 class3
3                img4                     class1

How can I reclassify the tags column such that any row with a class3 value gets assigned the string "yes" and everything else the string "no"?

I am aware that I can check for instances of a search word using the following:

df['tags'].str.contains('class3')

However, I am not sure how to integrate this into the task at hand.

The following is the intended output:

           image_name                       tags
0                img1                        yes
1                img2                         no
2                img3                        yes
3                img4                         no
Borealis asked Jan 02 '23 19:01


2 Answers

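Use np.where (this needs import numpy as np):

df['tags'] = np.where(df['tags'].str.contains('class3'), 'yes', 'no')

Or build the mask first, then assign. The mask has to be computed before the column is overwritten with 'no', otherwise nothing matches:

mask = df['tags'].str.contains('class3')
df['tags'] = 'no'
df.loc[mask, 'tags'] = 'yes'

Or use a list comprehension:

df['tags'] = ['yes' if 'class3' in s else 'no' for s in df['tags']]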

The output for the above methods:

print(df)
  image_name tags
0       img1  yes
1       img2   no
2       img3  yes
3       img4   no
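
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and checks that the three variants agree. The names has_class3, via_where, via_comprehension and out are just illustrative, and regex=False is my own addition so that 'class3' is treated as a literal substring.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({
    "image_name": ["img1", "img2", "img3", "img4"],
    "tags": ["class1 class2 class3", "class2", "class2 class3", "class1"],
})

# Compute the boolean mask once, while the column still holds the original tags.
# regex=False treats "class3" as a literal substring rather than a pattern.
has_class3 = df["tags"].str.contains("class3", regex=False)

# Variant 1: np.where picks "yes" where the mask is True, "no" elsewhere.
via_where = np.where(has_class3, "yes", "no")

# Variant 2: plain Python membership test per row.
via_comprehension = ["yes" if "class3" in s else "no" for s in df["tags"]]

# Variant 3: default everything to "no", then flip the masked rows.
out = df.copy()
out["tags"] = "no"
out.loc[has_class3, "tags"] = "yes"

# All three variants should agree on this data.
assert list(via_where) == via_comprehension == out["tags"].tolist()

df["tags"] = via_where
print(df)
```

Which variant to use is mostly a matter of taste on a frame this small; np.where and the mask-based .loc assignment are vectorized, so they are probably the better fit for large frames.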
Space Impact answered Jan 13 '23 02:01

You can also map the boolean result directly:

>>> df['tags'] = df.tags.str.contains('class3').map({True: 'yes', False: 'no'})
>>> df
  image_name tags
0       img1  yes
1       img2   no
2       img3  yes
3       img4   no
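
One caveat, as a hedged sketch rather than part of the question's data: str.contains propagates missing values unless na=False is passed, and a plain substring search for 'class3' would also match a tag such as 'class30'. The None row and the 'class30' tag below are hypothetical and only there to show both cases. The pattern uses \b word boundaries on the assumption that tags are separated by spaces.

```python
import pandas as pd

# Hypothetical data: one missing tag and one look-alike tag ("class30").
df = pd.DataFrame({
    "image_name": ["img1", "img2", "img3"],
    "tags": ["class1 class3", None, "class30"],
})

# na=False makes a missing tag count as "no match" instead of propagating NaN.
# \b anchors the match to whole tokens, so "class30" is not treated as "class3".
mask = df["tags"].str.contains(r"\bclass3\b", na=False)

# mask is a plain boolean Series here, so the dict covers every value.
df["tags"] = mask.map({True: "yes", False: "no"})
print(df)
```

If the tags never contain missing values or look-alike names, the shorter one-liner is enough.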
sacuL answered Jan 13 '23 00:01