Why can't you replace integers with lists using `replace` method - pandas

Say I have a pandas DataFrame like this:

import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3, 0]})

My goal is to replace the value 0 with [] (an empty list) in this DataFrame, so I tried:

print(df.replace(0,[]))

But it raises an error:

TypeError: Invalid "to_replace" type: 'int'

I also tried everything else I could think of, e.g.:

df[df==0]=[]

etc...

But nothing works.

Desired output (to avoid confusion):

   a
0  1
1  2
2  3
3 []
U12-Forward asked Nov 21 '18 10:11



2 Answers

This is possible with a list comprehension, but it is not recommended, because the column ends up with mixed content (numbers and lists):

df['a'] = [[] if x == 0 else x for x in df.a]

print (df)

    a
0   1
1   2
2   3
3  []

To replace the value in every column of the DataFrame:

df = df.applymap(lambda x: [] if x == 0 else x)
print (df)
    a
0   1
1   2
2   3
3  []
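
The element-wise approach above can be sketched for more than one column as follows. This is only an illustration: column 'b' is made up for the example, and the getattr fallback assumes you may be on pandas 2.1 or later, where `applymap` is deprecated in favor of `DataFrame.map`.

```python
import pandas as pd

# Column 'b' is hypothetical, added only to show the multi-column case.
df = pd.DataFrame({'a': [1, 2, 3, 0], 'b': [0, 5, 0, 7]})

# Use DataFrame.map when this pandas version has it (2.1+),
# otherwise fall back to the older DataFrame.applymap.
elementwise = getattr(pd.DataFrame, "map", None) or pd.DataFrame.applymap

# The lambda runs once per cell, so each zero gets its own new list object.
out = elementwise(df, lambda x: [] if x == 0 else x)

print(out)
print(out.dtypes)
```

Because the lambda builds a new list on every call, the empty lists are independent of each other: appending to one cell's list does not affect the others.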
jezrael answered Oct 06 '22 03:10


There are two issues here. The first is pandas' quirkiness when dealing with lists. To replace values in a DataFrame with lists, you need to do something like this:

df.loc[df.a == 0, "a"] = [[] for _ in range((df.a == 0).sum())]

This creates n empty lists, one for each item that matches the criterion (df.a == 0).

The second issue is that your column is of integer type, and you can't store a list in an integer column. So before you can assign the lists, you need to convert the column type to object.

df = df.astype(object)
df.loc[df.a == 0, "a"] = [[] for _ in range((df.a == 0).sum())]
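
To make the dtype point concrete, here is a minimal sketch that checks the column type before and after the conversion. Note that it swaps in a different assignment technique from the one above: it sets each matching cell individually with `DataFrame.at`, which is an assumption of this example rather than part of the answer. The name `zero_rows` is also just illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 0]})
print(df['a'].dtype)  # an integer dtype: cells can only hold integers

# Record which rows match before changing the dtype.
zero_rows = df.index[df['a'] == 0]

# Object columns can hold arbitrary Python objects, including lists.
df = df.astype(object)

# Assign one new empty list per matching cell (scalar access via .at).
for idx in zero_rows:
    df.at[idx, 'a'] = []

print(df)
print(df['a'].dtype)
```

Setting cells one at a time is slower than a vectorized assignment, but it sidesteps pandas trying to interpret the list as a sequence of values to spread across several cells.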
Lie Ryan answered Oct 06 '22 02:10