I have two numpy arrays that contain NaNs:
A = np.array([np.nan, 2, np.nan, 3, 4])
B = np.array([ 1 , 2, 3 , 4, np.nan])
Is there any smart way, using numpy, to remove the NaNs in both arrays and also remove whatever is at the corresponding index in the other array, making them look like this:
A = array([2, 3])
B = array([2, 4])
Dropping the missing (NaN) values can be done with the function numpy.isnan(), which gives a boolean mask marking the indexes that hold NaN; combined with numpy.logical_not(), which inverts the boolean values, this yields a mask of the positions to keep.
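For example, here is a minimal sketch of that approach applied to the two arrays above (note the mask has to combine both arrays so that matching positions are dropped from each):

import numpy as np

A = np.array([np.nan, 2, np.nan, 3, 4])
B = np.array([ 1 , 2, 3 , 4, np.nan])

# True wherever either array holds NaN
nan_in_either = np.logical_or(np.isnan(A), np.isnan(B))
# Invert to get the positions to keep
keep = np.logical_not(nan_in_either)

print(A[keep])  # [ 2.  3.]
print(B[keep])  # [ 2.  4.]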
With pandas, the dropna() method drops rows containing NaN (Not a Number) and None values from a DataFrame. Note that by default it returns a copy of the DataFrame after removing those rows; to modify the existing DataFrame in place, pass inplace=True.
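If you are happy to go through pandas, a small sketch of that idea for the two arrays above could look like this (assuming pandas is installed; dropna() with its defaults drops any row containing a NaN):

import numpy as np
import pandas as pd

A = np.array([np.nan, 2, np.nan, 3, 4])
B = np.array([ 1 , 2, 3 , 4, np.nan])

# Put both arrays in one DataFrame so a row is dropped
# whenever either column contains NaN
df = pd.DataFrame({"A": A, "B": B}).dropna()

print(df["A"].to_numpy())  # [ 2.  3.]
print(df["B"].to_numpy())  # [ 2.  4.]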
The nan_to_num() function is used when you want to replace NaN (Not a Number) with zero and inf with finite numbers in an array. It replaces positive infinity with a very large number and negative infinity with a very small (very negative) number.
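Note that nan_to_num() replaces values rather than removing them, so it does not shrink the arrays the way the question asks; a quick sketch of what it does (the nan= keyword is only available in newer NumPy versions):

import numpy as np

A = np.array([np.nan, 2, np.nan, 3, 4])

# The array keeps its length; NaN is substituted, not removed
print(np.nan_to_num(A))            # [ 0.  2.  0.  3.  4.]
print(np.nan_to_num(A, nan=-1.0))  # [-1.  2. -1.  3.  4.]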
What you could do is add the two arrays together: NaN propagates through addition, so the sum is NaN wherever either array holds a NaN. Use this to generate a boolean mask index, and then use that index to select from your original numpy arrays:
In [193]:
A = np.array([np.nan, 2, np.nan, 3, 4])
B = np.array([ 1 , 2, 3 , 4, np.nan])
idx = np.where(~np.isnan(A+B))
idx
print(A[idx])
print(B[idx])
[ 2. 3.]
[ 2. 4.]
Output from A+B:
In [194]:
A+B
Out[194]:
array([ nan, 4., nan, 7., nan])
EDIT
As @Oliver W. has correctly pointed out, the np.where is unnecessary, as np.isnan produces a boolean index that you can use to index into the arrays directly:
In [199]:
A = np.array([np.nan, 2, np.nan, 3, 4])
B = np.array([ 1 , 2, 3 , 4, np.nan])
idx = (~np.isnan(A+B))
print(A[idx])
print(B[idx])
[ 2. 3.]
[ 2. 4.]