Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

efficient way to update dataframe column based on condition

I want to update the values of one dataframe if conditions from a second frame applies.

Right now I'm looping through each row of the second frame to find it's location in the first frame and update that row when it's found but this is very resource inefficient.

Is there a better way of merging this?

For example, I want to update resolve to "yes" in T1 for values found in T2

T1
user_id   version        issue         Resolved
1         12345           crash           no   
2         12346           printing        no 
3         12347           screen          no
3         12348           screen          no


T2
user_id   version        issue 
1         12345           crash              
2         12346           printing              


Thanks!!

expected output:

T1
user_id   version        issue         Resolved
1         12345           crash           yes   
2         12346           printing        yes 
3         12347           screen          no
3         12348           screen          no
like image 614
DiSt8885 Avatar asked Mar 21 '26 15:03

DiSt8885


1 Answers

You can add a yes resolved to your new data, then combine the two

T2['Resolved'] = 'yes'
dtypes = T1.dtypes.combine_first(T2.dtypes)
T3 = T2.combine_first(T1)
for k, v in dtypes.iteritems(): 
    T3[k] = T3[k].astype(v) 

output:

   user_id  version     issue Resolved
0        1    12345     crash      yes
1        2    12346  printing      yes
2        3    12347    screen       no
3        3    12348    screen       no

I found a way to combine without converting back to floats as well:

T3 = T2.astype('object').combine_first(T1)

so you shouldn't need the for loops anymore

Check this solution where the combine_first didn't work. I create a map to the yes and apply them to the first frame:

T1 = pd.DataFrame({'A': [1,1,2,3], 'B': ['no', 'no','no','no'], 'C': ['no1', 'no2','no3','no4']})
T2 = pd.DataFrame({'A': [2, 3], 'B': ['yes','yes']})
mapitems = T2.set_index('A').to_dict()['B']   
T1['B'] = T1.apply(lambda x: mapitems[x.A] if x.A in mapitems.keys() else x.B, axis=1)
like image 195
oppressionslayer Avatar answered Mar 23 '26 03:03

oppressionslayer



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!