Pandas fillna values with numpy array

I have a dataframe like the one below. It has other columns, but this is the one that matters:

df = pd.DataFrame({'A': ['foo','bar','baz','foo','bar','bar','baz','foo']})

I'm trying to create another column and fill it with values from an array, but only in the rows where column A matches a given value.

New Column: df['B'] = np.nan

Array: arr = np.array([5,3,9])

Attempts

I want to assign the array values, in order, to the rows where column A is 'foo':

df['B'] = np.where(df['A']=='foo',arr,np.nan) # ValueError: operands could not be 
                                              # broadcast together with shapes 
                                              # (8,) (3,) () 

I also tried:

df['B'][df['A']=='foo'].values = arr # AttributeError: can't set attribute

Finally,

df['B'] = df['B'][df['A']=='foo'].map(arr) # TypeError: 'numpy.ndarray' object is not callable

Expected output

     A   B
0  foo   5
1  bar NaN
2  baz NaN
3  foo   3
4  bar NaN
5  bar NaN
6  baz NaN
7  foo   9
Leb asked Sep 03 '25 03:09



1 Answer

If you're sure that arr is the same length as the number of times 'foo' appears, you can use the following to set the values:

df.loc[df['A'] == 'foo', 'B'] = arr

This is a bit like df['B'][df['A']=='foo'] = arr (close to one of the methods you've tried), but avoids chained assignment (which can lead to values not being set correctly, or at all).
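As a minimal sketch of the approach above, the following rebuilds the question's dataframe and applies the `.loc` assignment end to end. The `mask` variable and the explicit length check are illustrative additions, not part of the original answer; they just make the "same length" assumption visible before the assignment runs.

```python
import numpy as np
import pandas as pd

# Data from the question
df = pd.DataFrame({'A': ['foo', 'bar', 'baz', 'foo', 'bar', 'bar', 'baz', 'foo']})
df['B'] = np.nan
arr = np.array([5, 3, 9])

# Boolean mask selecting the rows to fill (name chosen for illustration)
mask = df['A'] == 'foo'

# Illustrative guard: the assignment only makes sense if there is
# exactly one array value per matching row.
if mask.sum() != len(arr):
    raise ValueError(f"expected {mask.sum()} values, got {len(arr)}")

# Single .loc call: selects rows and column at once, so no chained assignment
df.loc[mask, 'B'] = arr

print(df)
```

The values are assigned in row order, so the first 'foo' row gets the first element of arr, and so on. Rows that don't match the mask keep their NaN.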

Alex Riley answered Sep 05 '25 21:09
