I have a simple apply function that I execute on some of the columns, but it keeps getting tripped up by NaN values in pandas.
import random
import re
import numpy as np
import pandas as pd
input_data = np.array(
[
[random.randint(0,9) for x in range(2)]+['']+['g'],
[random.randint(0,9) for x in range(3)]+['g'],
[random.randint(0,9) for x in range(3)]+['a'],
[random.randint(0,9) for x in range(3)]+['b'],
[random.randint(0,9) for x in range(3)]+['b']
]
)
input_df = pd.DataFrame(data=input_data, columns=['B', 'C', 'D', 'label'])
I have a simple lambda like this:
input_df['D'].apply(lambda aCode: re.sub('\.', '', aCode) if not np.isnan(aCode) else aCode)
And it gets tripped up by the NaN values:
File "<pyshell#460>", line 1, in <lambda>
input_df['D'].apply(lambda aCode: re.sub('\.', '', aCode) if not np.isnan(aCode) else aCode)
TypeError: Not implemented for this type
So, I tried just testing for nan values that Pandas adds:
np.isnan(input_df['D'].values[0])
np.isnan(input_df['D'].iloc[0])
Both get the same error.
I do not know how to test for NaN values other than with np.isnan. Is there an easier way to do this? Thanks.
To check whether a numeric value at a specific location in pandas is NaN, call numpy.isnan() with the value as the argument. It returns True if the value is NaN and False otherwise; for non-numeric input such as a string it raises a TypeError.
The math.isnan() function checks whether a value is NaN (Not a Number). It returns True if the specified value is NaN and False otherwise.
To check for missing values in a pandas DataFrame, use isnull() and notnull(). Both functions report whether a value is NaN or None, and both can also be used on a pandas Series to find null values.
With isnull().values.any() you can check whether a pandas DataFrame contains NaN or None in any cell (across all rows and columns). It returns True if at least one cell is NaN/None and False otherwise.
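The checks described above behave differently on non-numeric input, which is the crux of this question. The following is a minimal sketch with made-up sample values (the list `values` and the small frame `df` are illustrative assumptions, not data from the question):

```python
import math

import numpy as np
import pandas as pd

# Hypothetical sample values: a float NaN, None, an empty string, a number.
values = [np.nan, None, "", 1.5]

# pd.isnull accepts any Python object and flags NaN and None as missing.
# An empty string is NOT treated as missing.
print([pd.isnull(v) for v in values])

# math.isnan and np.isnan are meant for numeric input.
print(math.isnan(np.nan), np.isnan(np.nan))

# On a string, np.isnan raises TypeError instead of returning False.
try:
    np.isnan("")
except TypeError as exc:
    print("np.isnan failed:", type(exc).__name__)

# isnull().values.any() answers "is any cell in the frame missing?"
df = pd.DataFrame({"B": [1.0, np.nan], "label": ["g", None]})
print(df.isnull().values.any())
```

For object columns that may hold strings, pd.isnull / pd.notnull are usually the safer choice because they do not raise on non-numeric values.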
Your code fails because your first entry is an empty string, and np.isnan doesn't understand empty strings:
In [55]:
input_df['D'].iloc[0]
Out[55]:
''
In [56]:
np.isnan('')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-56-a9f139a0c5b8> in <module>()
----> 1 np.isnan('')
TypeError: Not implemented for this type
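A related detail that may be worth checking: building the frame from a single np.array that mixes integers and strings makes NumPy coerce every element to a string, so column D contains no NaN at all, only strings. The sketch below uses fixed stand-in values instead of the random integers from the question (those values are an assumption for reproducibility):

```python
import numpy as np
import pandas as pd

# Hypothetical fixed values standing in for random.randint(0, 9).
input_data = np.array([
    [1, 2, "", "g"],
    [4, 5, 3, "g"],
])
input_df = pd.DataFrame(data=input_data, columns=["B", "C", "D", "label"])

# Mixed int/str input is promoted to a unicode-string array (kind "U").
print(input_data.dtype.kind)

# Every entry in D is a string, including the numbers and the empty one.
print(input_df["D"].tolist())

# pandas sees no missing values here: '' is an ordinary string.
print(input_df["D"].isnull().any())
```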
pd.notnull does work:
In [57]:
import re
input_df['D'].apply(lambda aCode: re.sub(r'\.', '', aCode) if pd.notnull(aCode) else aCode)
Out[57]:
0
1 3
2 3
3 0
4 3
Name: D, dtype: object
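If the column can hold a mix of strings and real missing values, one option is to test for the type you want to transform rather than for NaN. This is only a sketch: the helper name `strip_dots` and the sample Series `codes` are hypothetical, not part of the original code.

```python
import re

import numpy as np
import pandas as pd


def strip_dots(code):
    """Remove literal dots from strings; return anything else unchanged."""
    if isinstance(code, str):
        return re.sub(r"\.", "", code)
    # NaN, None, and numbers fall through untouched.
    return code


# Hypothetical column mixing strings with missing values.
codes = pd.Series(["", "3.1", np.nan, None, "4"], name="D")
cleaned = codes.apply(strip_dots)
print(cleaned.tolist())
```

Checking `isinstance(code, str)` avoids calling re.sub on a float NaN, which is the situation the original lambda was trying to guard against.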
However, if you just want to replace something, then just use .str.replace:
In [58]:
input_df['D'].str.replace(r'\.', '', regex=True)
Out[58]:
0
1 3
2 3
3 0
4 3
Name: D, dtype: object
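Since the goal is to remove a literal dot, a regular expression is not strictly needed. The default for the `regex` argument of Series.str.replace changed to False in pandas 2.0, so it is safest to pass it explicitly. A small sketch with a hypothetical sample Series (`codes` is not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical column with an empty string and a real NaN.
codes = pd.Series(["", "3.1", np.nan, "4.0.2"], name="D")

# regex=False treats "." as a literal dot rather than "any character".
no_dots = codes.str.replace(".", "", regex=False)
print(no_dots.tolist())
```

The .str accessor skips missing values on its own, so no explicit NaN check is required here.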