
Removing a nan from a list

Tags:

python

nan

While working on a project with pandas, I ran into a problem: I had a list with a nan value in it and I couldn’t remove it.

I have tried:

incoms=data['int_income'].unique().tolist()
incoms.remove('nan')

But it didn’t work:

ValueError: list.remove(x): x not in list

The list incoms is as follows:

[75000.0, 50000.0, 0.0, 200000.0, 100000.0, 25000.0, nan, 10000.0, 175000.0, 150000.0, 125000.0]
Moran Reznik asked Aug 15 '17 14:08



3 Answers

I think you need dropna to remove the NaNs:

incoms=data['int_income'].dropna().unique().tolist()
print (incoms)
[75000.0, 50000.0, 0.0, 200000.0, 100000.0, 25000.0, 10000.0, 175000.0, 150000.0, 125000.0]

And if all values are integers only:

incoms=data['int_income'].dropna().astype(int).unique().tolist()
print (incoms)
[75000, 50000, 0, 200000, 100000, 25000, 10000, 175000, 150000, 125000]

Or remove NaNs by selecting all non-NaN values with numpy.isnan:

import numpy as np

a = data['int_income'].unique()
incoms = a[~np.isnan(a)].tolist()
print (incoms)
[75000.0, 50000.0, 0.0, 200000.0, 100000.0, 25000.0, 10000.0, 175000.0, 150000.0, 125000.0]

a = data['int_income'].unique()
incoms= a[~np.isnan(a)].astype(int).tolist()
print (incoms)
[75000, 50000, 0, 200000, 100000, 25000, 10000, 175000, 150000, 125000]

Pure Python solution, slower for a big DataFrame:

incoms = [x for x in set(data['int_income']) if pd.notnull(x)]
print (incoms)
[0.0, 100000.0, 200000.0, 25000.0, 125000.0, 50000.0, 10000.0, 150000.0, 175000.0, 75000.0]

incoms = [int(x) for x in set(data['int_income']) if pd.notnull(x)]
print (incoms)
[0, 100000, 200000, 25000, 125000, 50000, 10000, 150000, 175000, 75000]
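Note that going through set() does not keep the original order, as the outputs above show. If the order matters, a minimal standard-library sketch could look like the following. The sample list `values` is made up for illustration, and `math.isnan` is used here in place of `pd.notnull`, which assumes every element is a float:

```python
import math

# Hypothetical sample data; float("nan") stands in for the NaN entries.
values = [75000.0, 50000.0, float("nan"), 0.0, 50000.0, float("nan"), 75000.0]

# Drop the NaNs first, then let dict.fromkeys remove duplicates.
# A dict keeps keys in insertion order, so the first occurrence wins.
unique_in_order = list(dict.fromkeys(x for x in values if not math.isnan(x)))

print(unique_in_order)  # [75000.0, 50000.0, 0.0]
```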
jezrael answered Oct 19 '22 05:10


One possibility in this particular case is to remove the NaNs earlier, so you don't have to do it in the list:

incoms=data['int_income'].dropna().unique().tolist()
Rafael Valero answered Oct 19 '22 06:10


You can simply build a cleaned list that leaves out every value whose string representation is 'nan'.

The code would be:

incoms = [incom for incom in incoms if str(incom) != 'nan']
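As a side note on why the original `incoms.remove('nan')` fails: the list holds a float NaN, not the string 'nan', and NaN does not compare equal even to another NaN. A small sketch using only the standard library, assuming every element is a float (the list is the one from the question, with `float("nan")` standing in for the NaN entry):

```python
import math

# List copied from the question; float("nan") stands in for the NaN entry.
incoms = [75000.0, 50000.0, 0.0, 200000.0, 100000.0, 25000.0, float("nan"),
          10000.0, 175000.0, 150000.0, 125000.0]

# The string 'nan' is never equal to a float, so remove() raises ValueError.
try:
    incoms.remove('nan')
    removed = True
except ValueError:
    removed = False

# NaN is not equal to itself, so searching for a fresh NaN object by
# equality would not find it either.
nan_equals_itself = float("nan") == float("nan")

# Filter with math.isnan instead of comparing strings.
cleaned = [x for x in incoms if not math.isnan(x)]
print(cleaned)
```

This avoids the string conversion, but `math.isnan` raises a TypeError on non-numeric elements, so the `str(incom) != 'nan'` check may still be the more forgiving option for mixed lists.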
zoubida13 answered Oct 19 '22 05:10