
Update a DataFrame with a Series

Given a DataFrame, I want to update a subset of columns with a Series whose length equals the number of columns being updated:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randint(0,5,(6, 2)), columns=['col1','col2'])
>>> df

   col1  col2
0     1     0
1     2     4
2     4     4
3     4     0
4     0     0
5     3     1

>>> df.loc[:,['col1','col2']] = pd.Series([0,1])
...
ValueError: shape mismatch: value array of shape (6,) could not be broadcast to indexing result of shape (2,6)

It fails. However, I am able to do the same thing using a list:

>>> df.loc[:,['col1','col2']] = list(pd.Series([0,1]))
>>> df
   col1  col2
0     0     1
1     0     1
2     0     1
3     0     1
4     0     1
5     0     1

Could you please help me understand why updating with a Series fails? Do I have to perform some particular reshaping?

kekert asked Aug 31 '16 08:08




1 Answer

When the value being assigned is a pandas object, pandas treats the assignment more "rigorously": a pandas-to-pandas assignment has to pass stricter checks than a plain broadcast. Only when you turn the Series into a list (or, equivalently, pass pd.Series([0, 1]).values) does pandas give in and assign the way you'd imagine it should work.
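As a minimal sketch of the point above: stripping the Series down to a plain array lets its two values broadcast across every row. The sample values and the names `target` and `row` are made up for illustration, and `to_numpy()` is used here as the newer spelling of `.values`.

```python
import numpy as np
import pandas as pd

# Illustrative frame; the starting values are arbitrary.
target = pd.DataFrame(np.arange(12).reshape(6, 2), columns=["col1", "col2"])
row = pd.Series([0, 1])

# A plain NumPy array carries no index, so there is nothing to align on:
# the two values are broadcast positionally across the selected columns.
target.loc[:, ["col1", "col2"]] = row.to_numpy()

print(target)  # every row is now col1 == 0, col2 == 1
```

This is the same effect the `list(...)` call in the question has; either way, the index is dropped before the assignment.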

That higher standard of assignment also requires the indices to line up, so even a value with the right shape would not have worked without the correct index and column labels.

# Right shape (6, 2), but the default column labels 0 and 1 do not match 'col1' and 'col2'
df.loc[:, ['col1', 'col2']] = pd.DataFrame([[0, 1] for _ in range(6)])
df


# Same shape, now with column labels that match the columns being assigned
df.loc[:, ['col1', 'col2']] = pd.DataFrame([[0, 1] for _ in range(6)], columns=['col1', 'col2'])
df
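To illustrate that this kind of assignment matches on labels rather than on position, here is a small sketch with made-up data (the names `target` and `source` are hypothetical). The source frame lists its rows and columns in a different order from the target, yet each value lands on the matching label:

```python
import numpy as np
import pandas as pd

target = pd.DataFrame(np.zeros((3, 2), dtype=int), columns=["col1", "col2"])

# Same labels as the target, but rows and columns are in a different order.
source = pd.DataFrame(
    {"col2": [12, 11, 10], "col1": [2, 1, 0]},
    index=[2, 1, 0],
)

# pandas aligns `source` to the target's row and column labels before
# writing, so the positional order of `source` does not matter.
target.loc[:, ["col1", "col2"]] = source

print(target)  # col1 -> [0, 1, 2], col2 -> [10, 11, 12]
```

If positional assignment is what you want instead, drop the labels first, for example by passing the underlying array.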


piRSquared answered Oct 25 '22 17:10