
Pandas: Adding a DataFrame column based on another column, where multiple values map to the same new column value

I have a dataframe like this:

df1 = pd.DataFrame({'col1' : ['cat', 'cat', 'dog', 'green', 'blue']})

and I want a new column that gives the category, like this:

dfoutput = pd.DataFrame({'col1' : ['cat', 'cat', 'dog', 'green', 'blue'],
                         'col2' : ['animal', 'animal', 'animal', 'color', 'color']})

I know I could do it inefficiently using .loc:

df1.loc[df1['col1'] == 'cat','col2'] = 'animal'
df1.loc[df1['col1'] == 'dog','col2'] = 'animal'

How do I combine cat and dog to both be animal? This doesn't work:

df1.loc[df1['col1'] == 'cat' | df1['col1'] == 'dog','col2'] = 'animal'
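For reference, the line above fails because `|` binds more tightly than `==` in Python, so `'cat' | df1['col1']` is evaluated first instead of the two comparisons. A minimal sketch of two forms that should build the intended mask, assuming the same `df1` (the names `mask_or` and `mask_isin` are only illustrative):

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['cat', 'cat', 'dog', 'green', 'blue']})

# Option 1: parenthesize each comparison so `|` combines two boolean Series
mask_or = (df1['col1'] == 'cat') | (df1['col1'] == 'dog')

# Option 2: isin tests membership in a list of values
mask_isin = df1['col1'].isin(['cat', 'dog'])

# Either mask can be used with .loc; rows outside the mask stay NaN in col2
df1.loc[mask_isin, 'col2'] = 'animal'
```

Both masks select the same rows; `isin` tends to scale better as the list of values grows.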
Liquidity asked Jan 04 '19 00:01



1 Answer

Build a dict that maps each value to its category, then apply it to the column with map:

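# Map each col1 value to its category; values missing from the dict become NaN
d = {'dog': 'animal', 'cat': 'animal', 'green': 'color', 'blue': 'color'}
df1['col2'] = df1['col1'].map(d)
df1
    col1    col2
0    cat  animal
1    cat  animal
2    dog  animal
3  green   color
4   blue   color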
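When many values share one category, writing each value-to-category pair by hand gets repetitive. One possible variation, sketched here under the assumption that it is easier to list the members of each category, is to invert a category-to-members dict before calling map. The names `groups` and `lookup` are hypothetical and not part of the original answer:

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['cat', 'cat', 'dog', 'green', 'blue']})

# Hypothetical grouping: each category lists the col1 values that belong to it
groups = {'animal': ['cat', 'dog'], 'color': ['green', 'blue']}

# Invert the grouping into a value -> category lookup
lookup = {value: category
          for category, values in groups.items()
          for value in values}

# Same map call as before, now driven by the inverted dict
df1['col2'] = df1['col1'].map(lookup)
```

Adding a new member to a category then only means extending one list in `groups`.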
BENY answered Sep 20 '22 22:09
