Replace strings using a dictionary without deleting characters in a pandas dataframe

Question

I have a lookup problem that I tried to solve with replace, dict and zip (see below), but that does not produce my desired result because characters (underscores) are removed in the process.

Questions

What is an efficient way to generate df3 without removing the underscores from df1? In my real problem df1 is larger, at least (200, 500) rather than (2, 4) as in the example below.
Why can't I create df3 with replace(dict(zip(...))) as below without the underscores in df1 being removed?

df1 contains unique strings with underscores arranged in a specific pattern:

import pandas as pd
df1 = pd.DataFrame([['1_1','1_2', '2_1', '2_2'],['1_3','1_4', '2_3', '2_4']])
df1

     0    1    2    3
0  1_1  1_2  2_1  2_2
1  1_3  1_4  2_3  2_4

df2 contains a lookup table (keys in column a, values in column b) for some of the strings in df1:

df2 = pd.DataFrame([['1_1',234],['1_2',456],['2_3',324],['2_4',765]], columns = ['a', 'b'])
df2

     a    b
0  1_1  234
1  1_2  456
2  2_3  324
3  2_4  765

I want to create df3, in which each string in df1 that exactly matches a key in df2.a is replaced with the corresponding value in df2.b. However, when I run the following code, the strings that are not contained in df2.a (2_1, 2_2, etc.) lose their underscores in df3.

df3 = df1.replace(dict(zip(df2.a, df2.b)))
df3

     0    1    2    3
0  234  456   21   22
1   13   14  324  765

The desired result in df3 should instead be:

     0    1    2    3
0  234  456  2_1  2_2
1  1_3  1_4  324  765

Or, alternatively:

     0    1    2    3
0  234  456  NaN  NaN
1  NaN  NaN  324  765

anky · Accepted Answer

You can use df.mask as an alternative:

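# Series that maps each key in df2.a to its value in df2.b
s = df2.set_index('a')['b']
# Use the replaced value only where the cell is a key in s; keep df1 elsewhere
df1.mask(df1.isin(s.index), df1.replace(s))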

     0    1    2    3
0  234  456  2_1  2_2
1  1_3  1_4  324  765
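
On the second question (why the underscores disappear): a likely explanation, not confirmed in this thread, is that after the replacement the pandas version in use tried to convert the mixed columns back to numbers, and Python's int() accepts underscores as digit separators, so '1_3' parses as 13. The sketch below shows that parsing behaviour and a lookup that avoids replace altogether by going through a plain dict, which covers both desired outputs. The names mapping, df3_keep, df3_nan and underscore_demo are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame([['1_1', '1_2', '2_1', '2_2'], ['1_3', '1_4', '2_3', '2_4']])
df2 = pd.DataFrame([['1_1', 234], ['1_2', 456], ['2_3', 324], ['2_4', 765]],
                   columns=['a', 'b'])

# int() treats underscores as digit separators (Python 3.6+), so a numeric
# conversion of a leftover string such as '1_3' gives 13.
underscore_demo = int('1_3')

# Plain dict built from the lookup table: key string -> replacement value
mapping = dict(zip(df2.a, df2.b))

# Variant 1: unmatched strings stay unchanged (dict.get falls back to the cell itself)
df3_keep = df1.apply(lambda col: col.map(lambda v: mapping.get(v, v)))

# Variant 2: unmatched strings become NaN (Series.map with a dict)
df3_nan = df1.apply(lambda col: col.map(mapping))
```

df3_keep should match the first desired output and df3_nan the NaN alternative. Note that the columns of df3_nan become float because of the NaN values. Each cell costs one dict lookup, so a (200, 500) frame should not be a problem, though I have not timed it against the mask approach.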


Tags: python, dictionary, replace, pandas

Per Stattin

1 Answers

anky
