I have the following dictionary in Python:
OrderedDict([(30, ('A1', 55.0)), (31, ('A2', 125.0)), (32, ('A3', 180.0)), (43, ('A4', nan))])
Is there a way to remove the entries where any of the values is NaN? I tried this:
{k: dict_cg[k] for k in dict_cg.values() if not np.isnan(k)}
It would be great if the solution works for both Python 2 and Python 3.
Using the dropna() method, you can drop rows with NaN (Not a Number) and None values from a pandas DataFrame. By default it returns a copy of the DataFrame with those rows removed. To modify the existing DataFrame instead, pass inplace=True.
You can replace NaN with an empty string using the df.replace() function, which substitutes an empty string for each NaN value.
The del keyword deletes a key from a Python dictionary in place.
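For the dictionary in the question, the del approach could look roughly like the sketch below. It uses only the standard library, so it should behave the same on Python 2 and Python 3. The helper name has_nan is made up for illustration, and the sketch assumes every value is a tuple of strings and floats, as in the example data.

```python
import math
from collections import OrderedDict


def has_nan(values):
    """Return True if any float in the tuple is NaN (non-floats are skipped)."""
    return any(isinstance(v, float) and math.isnan(v) for v in values)


# Sample data copied from the question.
dict_cg = OrderedDict([
    (30, ('A1', 55.0)),
    (31, ('A2', 125.0)),
    (32, ('A3', 180.0)),
    (43, ('A4', float('nan'))),
])

# Copy the keys first: deleting from a dict while iterating over it
# directly raises a RuntimeError on Python 3.
for key in list(dict_cg.keys()):
    if has_nan(dict_cg[key]):
        del dict_cg[key]

print(list(dict_cg.keys()))  # [30, 31, 32]
```

The isinstance check is there because math.isnan raises a TypeError on strings such as 'A1'.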
Since you have pandas, you can leverage its pd.Series.notnull function (pd.Series.notna is an alias), which works with mixed dtypes.
>>> import pandas as pd
>>> {k: v for k, v in dict_cg.items() if pd.Series(v).notna().all()}
{30: ('A1', 55.0), 31: ('A2', 125.0), 32: ('A3', 180.0)}
This is not part of the answer, but may help you understand how I've arrived at the solution. I came across some weird behaviour when trying to solve this question, using pd.notnull directly.
Take dict_cg[43].
>>> dict_cg[43]
('A4', nan)
pd.notnull does not work.
>>> pd.notnull(dict_cg[43])
True
It treats the tuple as a single value (rather than an iterable of values). Furthermore, converting this to a list and then testing also gives an incorrect answer.
>>> pd.notnull(list(dict_cg[43]))
array([ True, True])
Since the second value is nan, the result I'm looking for should be [True, False]. It finally works when you pre-convert to a Series:
>>> pd.Series(dict_cg[43]).notnull()
0     True
1    False
dtype: bool
So, the solution is to Series-ify it and then test the values.
Along similar lines, another (admittedly roundabout) solution is to pre-convert to an object dtype numpy array, and pd.notnull will work directly:
>>> import numpy as np
>>> pd.notnull(np.array(dict_cg[43], dtype=object))
array([ True, False])
I imagine that pd.notnull directly converts dict_cg[43] to a string array under the covers, rendering the NaN as a string "nan", so it is no longer a "null" value.
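One way to see why that guess is plausible is to look at what NumPy does on its own with the same tuple. The sketch below does not inspect pandas internals; it only assumes that the list is turned into an array without an explicit dtype somewhere along the way. The variable names are illustrative.

```python
import numpy as np

pair = ('A4', float('nan'))

# With no dtype given, NumPy picks a common string dtype for str + float,
# so the NaN ends up stored as the text 'nan'.
as_strings = np.array(pair)
print(as_strings.dtype.kind)   # 'U' (unicode string) on Python 3
print(as_strings[1] == 'nan')  # True

# With dtype=object the original float is kept, so NaN != NaN still holds.
as_objects = np.array(pair, dtype=object)
print(as_objects[1] != as_objects[1])  # True
```

Once the NaN has become the string 'nan', a null check has nothing left to detect, which matches the [True, True] result above.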