
Replace NaN's in one column with string, based on value in another column

Tags: python, pandas

Where column B equals 't3', I want to replace the NaN value in column A with a new string.

All of my attempts below have failed.

import numpy as np
import pandas as pd

d = pd.DataFrame({"A": [np.nan, 't2', np.nan, 't3', np.nan], "B": ['t1', 't2', 't3', 't4', 't3']})
print("Original Dataframe:\n", d, sep="")

# Does not work
d[d.B == 't3'].A = 'new_val'

# Does not work
d[d.B == 't3'].A.replace(np.nan, 'new_val')

# Does not work
d[d.B == 't3'].A.replace(np.nan, 'new_val', inplace=True)

print("Final Dataframe:\n", d, sep="")

Here's the output:

Original Dataframe:
     A   B
0  NaN  t1
1   t2  t2
2  NaN  t3
3   t3  t4
4  NaN  t3

[5 rows x 2 columns]
Final Dataframe:
     A   B
0  NaN  t1
1   t2  t2
2  NaN  t3
3   t3  t4
4  NaN  t3
asked Mar 21 '23 12:03 by zbinsd


1 Answer

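Use loc; see http://pandas.pydata.org/pandas-docs/stable/indexing.html#different-choices-for-indexing-loc-iloc-and-ix

The attempts in the question fail because d[d.B == 't3'] returns a copy of the matching rows, so the assignment and the replace calls modify (or return) a temporary object and never reach d. Selecting the rows and the column in a single loc call assigns into d itself: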

In [5]:

d.loc[(d['A'].isnull()) & (d.B == 't3'), 'A']='new_val'

d

Out[5]:

         A   B
0      NaN  t1
1       t2  t2
2  new_val  t3
3       t3  t4
4  new_val  t3

[5 rows x 2 columns]
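
For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same idea. The names mask, by_loc and by_mask are only illustrative and do not appear in the question. It also shows one possible alternative, Series.mask, which builds a new column instead of assigning in place; treat it as a sketch of an option, not as the only way to do this.

```python
import numpy as np
import pandas as pd

d = pd.DataFrame({
    "A": [np.nan, "t2", np.nan, "t3", np.nan],
    "B": ["t1", "t2", "t3", "t4", "t3"],
})

# Rows where A is missing AND B equals 't3' (illustrative name).
mask = d["A"].isna() & (d["B"] == "t3")

# Option 1: assign through a single .loc call, as in the answer above.
# A copy is used here only so that d stays available for comparison.
by_loc = d.copy()
by_loc.loc[mask, "A"] = "new_val"

# Option 2: Series.mask replaces values where the condition is True
# and returns a new Series; d itself is left untouched.
by_mask = d.assign(A=d["A"].mask(mask, "new_val"))

print(by_loc)
```

Both variants should leave row 0 as NaN, because its B value is 't1' and the mask only selects rows where B is 't3'.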
answered Apr 06 '23 17:04 by EdChum