Convert a multi-category column into two categories in pandas

Question

I have a dataframe as shown below.

df:

ID      tag              
1       pandas
2       numpy
3       matplotlib
4       pandas
5       pandas
6       sns
7       sklearn
8       sklearn
9       pandas
10      pandas

To the above df, I would like to add a column named tag_binary that indicates whether the tag is pandas or not.

Expected output:

ID      tag            tag_binary         
1       pandas         pandas
2       numpy          non_pandas
3       matplotlib     non_pandas
4       pandas         pandas
5       pandas         pandas
6       sns            non_pandas
7       sklearn        non_pandas
8       sklearn        non_pandas
9       pandas         pandas
10      pandas         pandas

I tried the code below using a dictionary and the map function. It works fine, but I am wondering whether there is an easier way that does not require writing out the complete dictionary.

d = {'pandas':'pandas', 'numpy':'non_pandas', 'matplotlib':'non_pandas',
    'sns':'non_pandas', 'sklearn':'non_pandas'}
df["tag_binary"] = df['tag'].map(d)

ALollz · Accepted Answer

You can use where with an equality check to keep 'pandas' and fill everything else with 'non_pandas'.

df['tag_binary'] = df['tag'].where(df['tag'].eq('pandas'), 'non_pandas')

   ID         tag    tag_binary
0   1      pandas        pandas
1   2       numpy    non_pandas
2   3  matplotlib    non_pandas
3   4      pandas        pandas
4   5      pandas        pandas
5   6         sns    non_pandas
6   7     sklearn    non_pandas
7   8     sklearn    non_pandas
8   9      pandas        pandas
9  10      pandas        pandas
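
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the frame from the question (the construction code and the `is_pandas` name are my own, not from the question) and applies the same `where` call. `Series.where` keeps a value wherever the condition is True and substitutes the second argument everywhere else.

```python
import pandas as pd

# Rebuild the example frame from the question (construction is an assumption;
# the question only shows the printed table).
df = pd.DataFrame({
    "ID": range(1, 11),
    "tag": ["pandas", "numpy", "matplotlib", "pandas", "pandas",
            "sns", "sklearn", "sklearn", "pandas", "pandas"],
})

# Boolean mask: True where the tag is exactly 'pandas'.
is_pandas = df["tag"].eq("pandas")

# Keep the original value where the mask is True, otherwise use 'non_pandas'.
df["tag_binary"] = df["tag"].where(is_pandas, "non_pandas")

print(df)
```

Note that a missing tag (NaN) does not equal 'pandas', so it would also end up as 'non_pandas' with this approach.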

If you want something a little more flexible, so you can also map specific values to some label, then you can leverage the fact that for keys not in your dict, map returns NaN. So only specify mappings you care about and then fillna to deal with every other case.

# Could be more general like {'pandas': 'pandas', 'geopandas': 'pandas'}
d = {'pandas': 'pandas'}
# Tags missing from d map to NaN; fillna labels them 'non_pandas'
df['tag_binary'] = df['tag'].map(d).fillna('non_pandas')
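
To make the intermediate step visible, here is a small sketch of the more general dictionary hinted at in the comment above. The sample Series and the 'geopandas' tag are illustrative only and are not part of the question's data; `mapped` and `result` are names I made up.

```python
import pandas as pd

# Illustrative tags only; 'geopandas' does not appear in the question's data.
tags = pd.Series(["pandas", "geopandas", "numpy", "sklearn"])

# Only list the values you care about; several keys can share one label.
d = {"pandas": "pandas", "geopandas": "pandas"}

# Step 1: keys missing from the dict come back as NaN.
mapped = tags.map(d)

# Step 2: every NaN becomes the catch-all label.
result = mapped.fillna("non_pandas")

print(pd.DataFrame({"tag": tags, "mapped": mapped, "result": result}))
```

One thing to keep in mind: if the original column already contains NaN, those rows stay NaN after map and are then labelled 'non_pandas' by fillna as well.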

Tags:

python-3.x

pandas

dataframe

Danish
