I have a dataframe where I want to replace values in a column, but the dict describing the replacement is based on values in another column. A sample dataframe would look like this:
   Map me strings        date
0       1   test1  2020-01-01
1       2   test2  2020-02-10
2       3   test3  2020-01-01
3       4   test2  2020-03-15
I have a dictionary that looks like this:
map_dict = {'2020-01-01': {1: 4, 2: 3, 3: 1, 4: 2},
            '2020-02-10': {1: 3, 2: 4, 3: 1, 4: 2},
            '2020-03-15': {1: 3, 2: 2, 3: 1, 4: 4}}
I want the mapping logic to differ based on the date.
In this example, the expected output would be:
   Map me strings        date
0       4   test1  2020-01-01
1       4   test2  2020-02-10
2       1   test3  2020-01-01
3       4   test2  2020-03-15
I have a massive dataframe (100M+ rows), so I really want to avoid any looping solutions if at all possible.
I have tried to think of a way to use either map or replace, but have been unsuccessful.
You can replace values in all or selected columns of a pandas DataFrame based on a condition by using the DataFrame.loc[] property. loc[] accesses a group of rows and columns by label(s) or a boolean array, and it can be used both to read and to modify the values of the DataFrame.
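As a rough sketch of how loc[] could apply to the question above: loop over the distinct dates in map_dict (three iterations here, not one per row) and remap only the rows selected by each boolean mask. The sample frame is rebuilt from the question, and the date column is assumed to hold strings that match the dictionary keys exactly.

```python
import pandas as pd

# Sample data rebuilt from the question; "date" is assumed to be plain strings.
df = pd.DataFrame({
    "Map me": [1, 2, 3, 4],
    "strings": ["test1", "test2", "test3", "test2"],
    "date": ["2020-01-01", "2020-02-10", "2020-01-01", "2020-03-15"],
})
map_dict = {
    "2020-01-01": {1: 4, 2: 3, 3: 1, 4: 2},
    "2020-02-10": {1: 3, 2: 4, 3: 1, 4: 2},
    "2020-03-15": {1: 3, 2: 2, 3: 1, 4: 4},
}

# Keep the original values so every lookup reads unmapped data.
original = df["Map me"].copy()

# One pass per distinct date, not per row.
for date, mapping in map_dict.items():
    mask = df["date"] == date                       # boolean row selector
    df.loc[mask, "Map me"] = original[mask].map(mapping)

print(df)
```

This stays vectorized within each date, so its cost grows with the number of distinct dates. Whether that is acceptable for 100M+ rows depends on how many dates the real data contains.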
You can use df.replace({"Courses": dict}) to remap values in a column of a pandas DataFrame using a dictionary. It also accepts regular expressions as keys for regex substitutions.
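A minimal sketch of the nested-dictionary form of replace, using the "strings" column from the question instead of "Courses". The replacement labels ("first", "second", "case") are made up for illustration. Note that a single dictionary applies to the whole column, so on its own this does not give date-dependent mapping.

```python
import pandas as pd

df = pd.DataFrame({
    "Map me": [1, 2, 3, 4],
    "strings": ["test1", "test2", "test3", "test2"],
})

# Outer key = column name, inner dict = old value -> new value.
# The new labels are hypothetical and only serve as an example.
exact = df.replace({"strings": {"test1": "first", "test2": "second"}})

# With regex=True the inner keys are treated as regular expressions.
by_regex = df.replace({"strings": {r"^test": "case"}}, regex=True)

print(exact)
print(by_regex)
```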
To modify the value of a single cell such as [0, "A"], you can use either df.iat[0, 0] = 2 (by position) or df.at[0, 'A'] = 2 (by label).
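For illustration, here is the single-cell assignment on a small hypothetical frame (the columns "A" and "B" and their values are invented for this example). These accessors set one scalar at a time, so they are not a practical fit for remapping a large column.

```python
import pandas as pd

# Hypothetical two-row frame, used only to show the accessors.
df = pd.DataFrame({"A": [1, 5], "B": [3, 4]})

df.iat[0, 0] = 2      # by integer position: row 0, column 0
df.at[1, "A"] = 7     # by label: index 1, column "A"

print(df)
```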
Use DataFrame.join with a MultiIndex Series created by the DataFrame constructor and DataFrame.stack:
df = df.join(pd.DataFrame(map_dict).stack().rename('new'), on=['Map me','date'])
print(df)
   Map me strings        date  new
0       1   test1  2020-01-01    4
1       2   test2  2020-02-10    4
2       3   test3  2020-01-01    1
3       4   test2  2020-03-15    4
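The join above adds a separate new column, while the question asks for "Map me" itself to be replaced. A self-contained sketch of one way to finish the job follows. It assumes the date column holds strings matching the keys of map_dict, and the names lookup and out are introduced here for illustration. Keeping the original value where no mapping exists is also an assumption, not something the question specifies.

```python
import pandas as pd

df = pd.DataFrame({
    "Map me": [1, 2, 3, 4],
    "strings": ["test1", "test2", "test3", "test2"],
    "date": ["2020-01-01", "2020-02-10", "2020-01-01", "2020-03-15"],
})
map_dict = {
    "2020-01-01": {1: 4, 2: 3, 3: 1, 4: 2},
    "2020-02-10": {1: 3, 2: 4, 3: 1, 4: 2},
    "2020-03-15": {1: 3, 2: 2, 3: 1, 4: 4},
}

# DataFrame(map_dict): rows are old values, columns are dates.
# stack() turns it into a Series indexed by (old value, date).
lookup = pd.DataFrame(map_dict).stack().rename("new")

# Left join on the two key columns; no Python-level loop over rows.
out = df.join(lookup, on=["Map me", "date"])

# Overwrite "Map me" with the mapped value, keeping the original
# where no (value, date) pair was found (an assumed fallback).
out["Map me"] = out["new"].fillna(out["Map me"]).astype(df["Map me"].dtype)
out = out.drop(columns="new")

print(out)
```

If the real date column is a datetime dtype, the dictionary keys would need converting to the same type before the join, otherwise no rows will match.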