
replace empty list with NaN in pandas dataframe

Tags:

python

pandas

I'm trying to replace the empty lists in my data with NaN values. But how do I represent an empty list in the expression?

import numpy as np
import pandas as pd
d = pd.DataFrame({'x' : [[1,2,3], [1,2], ["text"], []], 'y' : [1,2,3,4]})
d

    x           y
0   [1, 2, 3]   1
1   [1, 2]      2
2   [text]      3
3   []          4



d.loc[d['x'] == [],['x']] = d.loc[d['x'] == [],'x'].apply(lambda x: np.nan)
d

ValueError: Arrays were different lengths: 4 vs 0

Also, I want to select [text] with d[d['x'] == ["text"]], but that raises ValueError: Arrays were different lengths: 4 vs 1, whereas selecting 3 with d[d['y'] == 3] works correctly. Why?

asked Nov 26 '16 13:11 by running man



2 Answers

If you wish to replace the empty lists in column x with NumPy NaN values, you can do the following:

d.x = d.x.apply(lambda y: np.nan if len(y)==0 else y)

If you want to subset the dataframe on rows equal to ['text'], try the following:

d[[y==['text'] for y in d.x]]
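Put together as a self-contained sketch, the two steps above might look like this. The names cleaned and text_rows are only illustrative, and the lambda assumes every cell in x supports len(), which holds for the example data but not necessarily for yours:

```python
import numpy as np
import pandas as pd

d = pd.DataFrame({'x': [[1, 2, 3], [1, 2], ["text"], []], 'y': [1, 2, 3, 4]})

# Replace each empty list with NaN, one cell at a time.
# Assumption: every cell in 'x' has a length (true for this example data).
cleaned = d.assign(x=d['x'].apply(lambda cell: np.nan if len(cell) == 0 else cell))

# Compare each cell to ['text'] in plain Python, then use the resulting
# list of booleans as a row mask.
text_rows = cleaned[[cell == ['text'] for cell in cleaned['x']]]

print(cleaned)
print(text_rows)
```

The list comprehension sidesteps pandas' own == operator, so the comparison is done cell by cell rather than against the whole column.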

I hope this helps.

answered Oct 09 '22 11:10 by Abdou


You can use the apply function to match a specific cell value, whether that value is a string, a list, or something else.

For example, in your case:

import pandas as pd
d = pd.DataFrame({'x' : [[1,2,3], [1,2], ["text"], []], 'y' : [1,2,3,4]})
d
    x           y
0   [1, 2, 3]   1
1   [1, 2]      2
2   [text]      3
3   []          4

If you use d == 3 to select the cells whose value is 3, it works fine:

      x       y
0   False   False
1   False   False
2   False   True
3   False   False

However, if you use the equality operator to match a list, the result may not be what you expect, for example with d == [text], d == ['text'], or d == '[text]'.
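As far as I understand it, the reason is that pandas treats a list on the right-hand side of == as a sequence to compare position by position, not as a single value to look for in each cell. The sketch below illustrates this on a single column; the variable names are only illustrative, and the exact wording of the error message differs between pandas versions:

```python
import pandas as pd

d = pd.DataFrame({'x': [[1, 2, 3], [1, 2], ["text"], []], 'y': [1, 2, 3, 4]})

# A scalar is broadcast and compared against every row.
scalar_match = d['y'] == 3

# A list with the same length as the column is compared position by position.
positional_match = d['y'] == [1, 0, 3, 0]

# A list with a different length cannot be lined up with the rows,
# so pandas raises ValueError instead of searching for the list in each cell.
try:
    d['x'] == ["text"]
    length_error = None
except ValueError as exc:
    length_error = exc

print(scalar_match.tolist())
print(positional_match.tolist())
print(type(length_error).__name__)
```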

There are some solutions:

  1. Use the apply() function on the specific Series in your DataFrame, as in the other answer.

  2. A more general method is to use the applymap() function on the whole DataFrame, which can serve as a preprocessing step:

    d.applymap(lambda x: x == [])

           x      y
    0  False  False
    1  False  False
    2  False  False
    3   True  False

I hope this helps you and later readers. It would be better to add a type check in your applymap function, which could otherwise raise exceptions.
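A type-checked version of the element-wise test could be sketched as follows. The helper is_empty_list is a hypothetical name. Recent pandas releases (2.1 and later) deprecate applymap() in favour of DataFrame.map(), so the sketch picks whichever is available:

```python
import numpy as np
import pandas as pd

d = pd.DataFrame({'x': [[1, 2, 3], [1, 2], ["text"], []], 'y': [1, 2, 3, 4]})

def is_empty_list(cell):
    # The isinstance check keeps len() away from ints, floats, NaN and so on.
    return isinstance(cell, list) and len(cell) == 0

# DataFrame.map exists in pandas 2.1+; older versions only have applymap.
elementwise = getattr(d, "map", None) or d.applymap
empty_cells = elementwise(is_empty_list)

# Use the boolean column to overwrite the empty lists in 'x' with NaN.
d.loc[empty_cells['x'], 'x'] = np.nan

print(empty_cells)
print(d)
```

Because the helper returns False for anything that is not a list, it can be applied to the whole DataFrame without caring what each column holds.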

answered Oct 09 '22 09:10 by Shawn Mark