
Assign a Series to several Rows of a Pandas DataFrame

Tags: python, pandas

I have a pandas DataFrame prepared with an index and columns; all of its values are NaN. I have now computed a result that is valid for more than one row of the DataFrame, and I would like to assign it to all of those rows at once. This could be done with a loop, but I am pretty sure the assignment can be done in a single step.

Here is a scenario:

import pandas as pd
df = pd.DataFrame(index=['A', 'B', 'C'], columns=['C1', 'C2'])  # original df
s = pd.Series({'C1': 1, 'C2': 'ham'})  # a computed result
index = pd.Index(['A', 'C'])  # result is valid for rows 'A' and 'C'

The naive approach is

df.loc[index, :] = s

But this does not change the DataFrame at all. It remains as

    C1   C2
A  NaN  NaN
B  NaN  NaN
C  NaN  NaN

How can this assignment be done?

Nras asked Jun 28 '17 14:06


1 Answer

It seems we can assign the underlying array data of the Series instead:

df.loc[index, :] = s.values

Note that this assumes the index of s is in the same order as the columns of df. If that is not the case, then, as suggested by @Nras, we can use s[df.columns].values on the right-hand side of the assignment.
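For completeness, here is a minimal sketch of the reordered variant, reusing the setup from the question. The only change I have made is an assumption for illustration: the keys of s are deliberately given in a different order than the columns of df, so the s[df.columns] step actually matters.

```python
import pandas as pd

# Same setup as in the question; the Series keys are deliberately
# out of order relative to df.columns (assumption for illustration).
df = pd.DataFrame(index=['A', 'B', 'C'], columns=['C1', 'C2'])
s = pd.Series({'C2': 'ham', 'C1': 1})
index = pd.Index(['A', 'C'])

# Reorder s to match the column order, then assign the raw array.
# The 1-D array is broadcast across the selected rows.
df.loc[index, :] = s[df.columns].values

print(df)
```

Rows 'A' and 'C' should now hold 1 and 'ham', while row 'B' stays NaN. Passing the plain array sidesteps index alignment entirely, which is why the column order has to be fixed by hand first.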

Divakar answered Oct 25 '22 23:10