I keep getting the following error when I try to conditionally update a DataFrame column with values from another column:
ValueError: cannot set using a multi-index selection indexer with a different length than the value.
I haven't been able to figure out the cause after spending hours on it. Here is simplified code that demonstrates the issue:
import numpy as np
import pandas as pd

dfm = pd.DataFrame({'jim': [0, 0, 1, 1],
                    'joe': ['x', 'y', 'z', 'y'],
                    'jolie': np.random.rand(4),
                    'folie': np.random.rand(4)})
dfm = dfm.set_index(['jim', 'joe'])
# This is the line that raises the ValueError:
dfm.loc[(dfm['jolie'] == 1), 'jolie'] = dfm['folie']
As soon as I remove the index, the last line of code above works. My questions are: What am I doing wrong? Can the above code be fixed without removing the index? Is this a bug in pandas? I would appreciate your help.
The issue here is most likely that the length of dfm.loc[(dfm['jolie'] == 1), 'jolie'] is different from that of dfm['folie'], since the former only selects a sub-series of dfm['jolie'].
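To make the length mismatch visible, here is a small sketch. It assumes fixed illustrative values instead of np.random.rand (with random floats the condition jolie == 1 would almost never be true); the names mask, lhs and rhs are just for illustration and are not part of the original code.

```python
import pandas as pd

# Illustrative data (an assumption): two rows have jolie == 1
dfm = pd.DataFrame({'jim': [0, 0, 1, 1],
                    'joe': ['x', 'y', 'z', 'y'],
                    'jolie': [1.0, 0.25, 1.0, 0.75],
                    'folie': [0.1, 0.2, 0.3, 0.4]})
dfm = dfm.set_index(['jim', 'joe'])

mask = dfm['jolie'] == 1

lhs = dfm.loc[mask, 'jolie']   # only the rows where the condition holds
rhs = dfm['folie']             # every row of the frame

# The left-hand selection is shorter than the value being assigned
print(len(lhs), len(rhs))
```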
In addition, when assigning the values of one series to another, the indexes of the two must match, whether they are single- or multi-index.
For example, the following would work:
dfm.loc[(dfm['jolie'] == 1), 'jolie'] = dfm.loc[(dfm['jolie'] == 1), 'folie']
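As a rough sketch of how this could look while keeping the MultiIndex, the snippet below applies the same mask on both sides and also shows two alternatives that avoid the .loc assignment altogether (Series.mask and np.where). The fixed values and the names mask, via_loc, via_mask and via_where are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): two rows have jolie == 1
dfm = pd.DataFrame({'jim': [0, 0, 1, 1],
                    'joe': ['x', 'y', 'z', 'y'],
                    'jolie': [1.0, 0.25, 1.0, 0.75],
                    'folie': [0.1, 0.2, 0.3, 0.4]})
dfm = dfm.set_index(['jim', 'joe'])

mask = dfm['jolie'] == 1

# Option 1: use the same mask on both sides so the lengths agree
via_loc = dfm.copy()
via_loc.loc[mask, 'jolie'] = via_loc.loc[mask, 'folie']

# Option 2: Series.mask replaces values where the condition is True
via_mask = dfm.copy()
via_mask['jolie'] = via_mask['jolie'].mask(mask, via_mask['folie'])

# Option 3: np.where picks folie where the condition holds, jolie otherwise
via_where = dfm.copy()
via_where['jolie'] = np.where(mask, via_where['folie'], via_where['jolie'])

print(via_loc)
```

All three variants leave the index in place; which one reads best is largely a matter of taste.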