I have a pandas.Series in which each element is a list (so the Series has dtype object). E.g.
>>> import numpy as np
>>> import pandas as pd
>>> x = pd.Series([[1,2,3], [2,np.nan], [3,4,5,np.nan], [np.nan]])
>>> x
0 [1, 2, 3]
1 [2, nan]
2 [3, 4, 5, nan]
3 [nan]
dtype: object
How do I remove the nan values from the list in each row?
The desired output would be:
>>> x
0 [1, 2, 3]
1 [2]
2 [3, 4, 5]
3 []
dtype: object
This works:
>>> x.apply(lambda y: pd.Series(y).dropna().values.tolist())
0 [1, 2, 3]
1 [2.0]
2 [3.0, 4.0, 5.0]
3 []
dtype: object
Is there a simpler method than using a lambda that converts each list to a Series, drops the NaN values, and then extracts the values back into a list again?
The pandas Series method dropna() removes missing values from a Series. It does not modify the original Series; instead, it returns a new Series with the NaN values removed.
On a DataFrame, dropna() drops the rows that contain NaN (Not a Number) or None values. By default it returns a copy of the DataFrame with those rows removed. To modify the existing DataFrame instead, pass inplace=True.
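As a small illustration of the behaviour described above, the sketch below shows that dropna() returns a new object and leaves the original alone. It also shows why calling dropna() directly on a Series of lists does not solve the question: the elements are lists, not NaN. The variable names (s, df, nested) are made up for this example.

```python
import numpy as np
import pandas as pd

# dropna() on a Series returns a new Series; the original keeps its NaN.
s = pd.Series([1.0, np.nan, 3.0])
cleaned = s.dropna()

# dropna() on a DataFrame drops every row that contains a missing value.
df = pd.DataFrame({"a": [1.0, np.nan], "b": [2.0, 3.0]})
df_clean = df.dropna()

# inplace=True modifies the object it is called on and returns None.
df_inplace = df.copy()
df_inplace.dropna(inplace=True)

# A Series whose elements are lists: no element is itself missing,
# so dropna() removes nothing, even though the lists contain NaN.
nested = pd.Series([[2, np.nan], [np.nan]])
nested_after = nested.dropna()

print(cleaned.tolist())
print(len(df), len(df_clean), len(df_inplace))
print(len(nested_after))
```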
To remove NaN values from a plain Python list with math.isnan(), first create an empty list named newList. Then loop over the elements of the original list with a for loop, check each one with math.isnan(), and append it to newList only if it is not NaN.
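The loop described above could look roughly like this. It is only a sketch: the function name remove_nan is made up for the example, and it assumes every element of the list is a number, since math.isnan() raises TypeError for values such as strings or None.

```python
import math


def remove_nan(values):
    """Return a new list with the NaN entries of `values` left out."""
    new_list = []
    for value in values:
        # math.isnan() is True only for NaN; ints and ordinary floats pass.
        if not math.isnan(value):
            new_list.append(value)
    return new_list


print(remove_nan([3, 4, 5, float("nan")]))
print(remove_nan([float("nan")]))
```

With the Series from the question, this function could be passed straight to apply, as in x.apply(remove_nan).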
The fillna() method replaces missing values with a specified value; it replaces the NaN or NA values across the entire Series. Its value parameter specifies the value used to replace the NaNs and defaults to None. Its method parameter selects how missing values are filled in a reindexed Series.
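For comparison with dropna(), here is a minimal sketch of fillna() on a flat Series, using only the value argument. The Series s and the replacement value 0 are arbitrary choices for the example.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

# Replace every missing value with 0; a new Series is returned
# and the original still contains its NaN.
filled = s.fillna(0)

print(filled.tolist())
print(int(s.isna().sum()))
```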
You can use a list comprehension with pandas.notnull to remove the NaN values:
print (x.apply(lambda y: [a for a in y if pd.notnull(a)]))
0 [1, 2, 3]
1 [2]
2 [3, 4, 5]
3 []
dtype: object
Another solution uses filter with the condition v == v, which is False only for NaN (because v != v is True only for NaN):
print (x.apply(lambda a: list(filter(lambda v: v==v, a))))
0 [1, 2, 3]
1 [2]
2 [3, 4, 5]
3 []
dtype: object
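The v == v trick relies on NaN being the only float value that is not equal to itself. The sketch below demonstrates that, and also points out one difference from pd.notnull: None compares equal to itself, so the filter keeps it, while pd.notnull treats it as missing. The sample list values is made up for the example.

```python
import numpy as np
import pandas as pd

nan = float("nan")
is_equal_to_itself = nan == nan      # False: NaN never equals itself
is_unequal_to_itself = nan != nan    # True

values = [1, None, np.nan]

# v == v drops NaN but keeps None, because None == None is True.
kept_by_eq = list(filter(lambda v: v == v, values))

# pd.notnull treats both None and NaN as missing.
kept_by_notnull = [v for v in values if pd.notnull(v)]

print(is_equal_to_itself, is_unequal_to_itself)
print(kept_by_eq)
print(kept_by_notnull)
```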
Thank you DYZ for another solution (note that np.isfinite also removes inf and -inf, not just NaN):
print (x.apply(lambda y: list(filter(np.isfinite, y))))
0 [1, 2, 3]
1 [2]
2 [3, 4, 5]
3 []
dtype: object
A simple numpy solution with a list comprehension (each element of the result is a NumPy array rather than a list):
pd.Series([np.array(e)[~np.isnan(e)] for e in x.values])
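If plain lists are wanted, as in the desired output from the question, one possible variation is to call tolist() on each filtered array. This is a sketch rather than part of the original answer; like the one-liner above, it assumes every list holds only numeric values, since np.isnan does not accept strings or None.

```python
import numpy as np
import pandas as pd

x = pd.Series([[1, 2, 3], [2, np.nan], [3, 4, 5, np.nan], [np.nan]])

# Boolean-mask each list with ~np.isnan, then convert the array back to a list.
result = pd.Series([np.array(e)[~np.isnan(e)].tolist() for e in x])

print(result.tolist())
```

Lists that contained a NaN come back as floats (for example [2.0]), because NumPy stores an array that mixes integers and NaN as floating point.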