
How to filter in NaN (pandas)?

Tags: python, pandas, nan

I have a pandas dataframe (df), and I want to do something like:

newdf = df[(df.var1 == 'a') & (df.var2 == NaN)] 

I've tried replacing NaN with np.NaN, or 'NaN' or 'nan' etc, but nothing evaluates to True. There's no pd.NaN.

I can use df.fillna(np.nan) before evaluating the above expression but that feels hackish and I wonder if it will interfere with other pandas operations that rely on being able to identify pandas-format NaN's later.

I get the feeling there should be an easy answer to this question, but somehow it has eluded me. Any advice is appreciated. Thank you.

Gerhard asked Jul 31 '14 02:07




1 Answer

Simplest of all solutions:

filtered_df = df[df['var2'].isnull()] 

This keeps only the rows whose value in the 'var2' column is NaN.
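To cover the full condition from the question, the null check can be combined with the var1 test. The comparison against NaN in the question fails because NaN never compares equal to anything, itself included, so `df.var2 == np.nan` is False for every row. Below is a minimal sketch; the sample data is made up for illustration, and only the column names var1 and var2 come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "var1": ["a", "a", "b", "a"],
    "var2": [1.0, np.nan, np.nan, np.nan],
})

# NaN is never equal to anything, so this mask is False everywhere.
eq_mask = df["var2"] == np.nan

# isnull() (isna() is an alias) flags missing values; combine it with
# the other condition using &, keeping each condition in parentheses.
newdf = df[(df["var1"] == "a") & (df["var2"].isnull())]
```

For the opposite filter (rows where var2 is not NaN), `df['var2'].notnull()` should work the same way.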

Gil Baggio answered Sep 28 '22 21:09
