
Querying for NaN and other names in Pandas

Tags: python, pandas

Say I have a dataframe df with a column value holding some float values and some NaN. How can I get the part of the dataframe where we have NaN using the query syntax?

The following, for example, does not work:

df.query( '(value < 10) or (value == NaN)' )

I get the error name 'NaN' is not defined (the same happens with df.query('value == NaN')).

Generally speaking, is there any way to use numpy names in query, such as inf, nan, pi, e, etc.?

Amelio Vazquez-Reina asked Oct 23 '14 19:10




3 Answers

In general, you could use @local_variable_name, so something like

>>> import numpy as np
>>> import pandas as pd
>>> pi = np.pi; nan = np.nan
>>> df = pd.DataFrame({"value": [3,4,9,10,11,np.nan,12]})
>>> df.query("(value < 10) and (value > @pi)")
   value
1      4
2      9

would work, but NaN is not equal to anything, including itself, so value == @nan is always false. One workaround exploits exactly that fact: value != value is true only for NaN, so it serves as an isnan check. We have

>>> df.query("(value < 10) or (value == @nan)")
   value
0      3
1      4
2      9

but

>>> df.query("(value < 10) or (value != value)")
   value
0      3
1      4
2      9
5    NaN
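
The pieces above can be put together in one self-contained script. This is only a minimal sketch, assuming pandas and NumPy are installed; the names between, eq_nan and is_nan are made up for illustration. It binds a few NumPy constants to local names, references them with @, and contrasts the equality test against NaN with the value != value check.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"value": [3, 4, 9, 10, 11, np.nan, 12]})

# Bind the NumPy constants to local names so query() can see them via "@".
pi, inf, nan = np.pi, np.inf, np.nan

# NaN fails every ordered comparison, so the NaN row is excluded here.
between = df.query("value > @pi and value < @inf")

# Never matches anything: NaN is not equal to NaN.
eq_nan = df.query("value == @nan")

# True only where the value is NaN.
is_nan = df.query("value != value")

print(between.index.tolist())
print(eq_nan.empty)
print(is_nan.index.tolist())
```

The same @ prefix should work for any other local name, such as e = np.e.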
DSM answered Oct 20 '22 06:10


According to this answer you can use:

df.query('value < 10 | value.isnull()', engine='python')

I verified that it works.
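
As a rough cross-check, the query above can be compared with the equivalent boolean mask written outside of query. This is a sketch on made-up sample data, and the names via_query and via_mask are illustrative. It passes engine='python' explicitly, as the answer does; whether the default engine also accepts method calls may depend on the pandas version.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"value": [3, 4, 9, 10, 11, np.nan, 12]})

# Inside query(), "|" has the precedence of "or", so no parentheses are needed.
via_query = df.query("value < 10 | value.isnull()", engine="python")

# Outside query(), "|" binds tighter than "<", so the parentheses are required.
via_mask = df[(df["value"] < 10) | df["value"].isna()]

print(via_query.equals(via_mask))
```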

Eric Ness answered Oct 20 '22 05:10


For rows where value is not null (NaN never equals itself, so the comparison drops those rows):

df.query("value == value")

For rows where value is null:

df.query("value != value")
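
A quick sanity check of the two queries, sketched on a small made-up frame (the names not_null and null are illustrative): together they should split the rows with no overlap.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"value": [3, 4, 9, 10, 11, np.nan, 12]})

not_null = df.query("value == value")  # rows whose value equals itself
null = df.query("value != value")      # rows whose value is NaN

# Every row lands in exactly one of the two results.
print(len(not_null), len(null), len(df))
```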
as - if answered Oct 20 '22 06:10