I have a problem to solve in my pandas dataframe with Python3. I have two dataframes - the first one is as;
ID Name Linked Model 1 Linked Model 2 Linked Model 3
0 100 A 1111.0 1112.0 NaN
1 101 B 1112.0 1113.0 1115.0
2 102 C NaN NaN NaN
3 103 D 1114.0 NaN NaN
4 104 E 1114.0 1111.0 1112.0
the second one is;
Model ID Name
0 1111 A
1 1112 A,B
2 1113 C
3 1114 D
4 1115 Q
5 1116 Z
6 1117 E
7 1118 W
So the code should look up the value in - for instance in Linked Model 1 column and find the corresponding value in Name column in the second dataframe so that the ID can be replaced with name just like as shown in the result;
So as you can see in the result output, None stays as None (could be replaced numpy N/As) and the names from the second dataframe are now replaced with their corresponding Model IDs in the first dataframe.
I am looking forward to hearing your solutions!
Thanks
Initialise a replacement dictionary and use df.replace
to map those IDs to Names.
m = df2.set_index('Model ID')['Name'].to_dict()
v = df.filter(like='Linked Model')
df[v.columns] = v.replace(m)
df
ID Name Linked Model 1 Linked Model 2 Linked Model 3
0 100 A A A,B NaN
1 101 B A,B C Q
2 102 C NaN NaN NaN
3 103 D D NaN NaN
4 104 E D A A,B
First attempt to answer a python question, so while this is certainly longer than coldspeed's answer, it makes more sense to me using the melt
, merge
, and pivot
funcitons.
import pandas as pd
import numpy as np
# make an object from the first dataset
df_1 = pd.DataFrame(
{"ID" : [100, 101, 102, 103, 104],
"Name" : ["A", "B", "C", "D", "E"],
"Linked Model 1" : [1111, 1112, np.nan, 1114, 1114],
"Linked Model 2" : [1112, 1113, np.nan, np.nan, 1111],
"Linked Model 3" : [np.nan, 1115, np.nan, np.nan, 1112]})
# make an object for the second data set
df_2 = pd.DataFrame(
{"Model ID" : [1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118],
"Name" : ["A", "A,B", "C", "D", "Q", "Z", "E", "W"]})
# tidy the data
df_1 = pd.melt(df_1, ["ID", "Name"])
# left join the second data set
df_1 = pd.merge(df_1, df_2, how='left', left_on='value', right_on='Model ID').reset_index()
#pivot the data back out to achieve the desired format
df_1 = df_1.pivot_table(index='ID',
columns='variable',
values='Name_y',
aggfunc='first',
dropna=False))
variable Linked Model 1 Linked Model 2 Linked Model 3
ID
100 A A,B NaN
101 A,B C Q
102 NaN NaN NaN
103 D NaN NaN
104 D A A,B
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With