I have a dataframe that looks like this:
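import numpy as np
import pandas as pd

# np.nan is used instead of np.NaN, which was removed in NumPy 2.0
table = pd.DataFrame({'a': [0, 0, 0, 0],
                      'b': [1, 1, 1, 3],
                      'c': [2, 2, 5, 4],
                      'd': [3, np.nan, 6, 6],
                      'e': [4, np.nan, 7, 8],
                      'f': [np.nan, np.nan, np.nan, 10]}, dtype='float64')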
     a    b    c    d    e     f
0  0.0  1.0  2.0  3.0  4.0   NaN
1  0.0  1.0  2.0  NaN  NaN   NaN
2  0.0  1.0  5.0  6.0  7.0   NaN
3  0.0  3.0  4.0  6.0  8.0  10.0
For each row, I'm trying to find the index of the column holding the first NaN value, so that I can store it in a variable and use it later.
So far I have tried the code below, but it doesn't give me exactly what I want: I don't want an array, just a single value.
for i in table.itertuples():
    x = np.where(np.isnan(i))
    print(x)
(array([6]),)
(array([4, 5, 6]),)
(array([6]),)
(array([], dtype=int64),)
Thanks in advance for any comments or advice!
Check for NaN with isna, get the column label of the row-wise maximum with idxmax, and mask out the rows that have no NaN at all.
table.isna().idxmax(axis=1).where(table.isna().any(axis=1))
#0 f
#1 d
#2 f
#3 NaN
#dtype: object
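Since the question asks for a single value rather than an array, here is a minimal sketch of how the result above could be stored and read back one row at a time. The names mask, first_nan_col and label_row1 are made up for illustration, and the frame is rebuilt so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

table = pd.DataFrame({'a': [0, 0, 0, 0],
                      'b': [1, 1, 1, 3],
                      'c': [2, 2, 5, 4],
                      'd': [3, np.nan, 6, 6],
                      'e': [4, np.nan, 7, 8],
                      'f': [np.nan, np.nan, np.nan, 10]}, dtype='float64')

# Boolean frame: True where a cell is NaN
mask = table.isna()

# Label of the first NaN column per row; rows without any NaN become NaN
first_nan_col = mask.idxmax(axis=1).where(mask.any(axis=1))

# Read a single scalar for one row (row label 1 here)
label_row1 = first_nan_col.loc[1]
print(label_row1)  # d
```

Computing the mask once and reusing it also avoids calling isna twice.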
Or, if you need the column positions, as commented by @hpaulj, you can use argmax:
import numpy as np
is_missing = table.isna().values
np.where(is_missing.any(1), is_missing.argmax(1), np.nan)
# array([ 5., 3., 5., nan])
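Note that using np.nan as the filler forces the result to float. If integer positions are preferred, one possible variation (my assumption, not part of the original answer) is to use -1 as a sentinel for rows without NaN. The names pos and labels are hypothetical.

```python
import numpy as np
import pandas as pd

table = pd.DataFrame({'a': [0, 0, 0, 0],
                      'b': [1, 1, 1, 3],
                      'c': [2, 2, 5, 4],
                      'd': [3, np.nan, 6, 6],
                      'e': [4, np.nan, 7, 8],
                      'f': [np.nan, np.nan, np.nan, 10]}, dtype='float64')

is_missing = table.isna().to_numpy()

# Position of the first NaN per row, or -1 if the row has no NaN
pos = np.where(is_missing.any(axis=1), is_missing.argmax(axis=1), -1)
print(pos.tolist())  # [5, 3, 5, -1]

# Map positions back to column labels, with None for the sentinel
labels = [table.columns[p] if p >= 0 else None for p in pos]
print(labels)  # ['f', 'd', 'f', None]
```

The sentinel has to be checked before indexing, because -1 would otherwise silently select the last column.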
Use:
t = np.isnan(table.values).argmax(axis=1)
print(t)
[5 3 5 0]
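But if you need rows without NaNs to be distinguishable, shift the positions by one (here reset_index adds the index as an extra leading column), so that 0 is left for rows with no NaN: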
t = np.isnan(table.reset_index().values).argmax(axis=1)
print(t)
[6 4 6 0]
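For completeness, the loop from the question can also be adapted to yield one value per row. The positions it printed were shifted by one because itertuples includes the index as the first element by default. The sketch below passes index=False and keeps only the first hit; the use of None for rows without NaN and the variable names are my own choices.

```python
import numpy as np
import pandas as pd

table = pd.DataFrame({'a': [0, 0, 0, 0],
                      'b': [1, 1, 1, 3],
                      'c': [2, 2, 5, 4],
                      'd': [3, np.nan, 6, 6],
                      'e': [4, np.nan, 7, 8],
                      'f': [np.nan, np.nan, np.nan, 10]}, dtype='float64')

first_nan_positions = []
for row in table.itertuples(index=False):
    # Positions of all NaN cells in this row
    nan_positions = np.flatnonzero(np.isnan(row))
    # Keep only the first one, or None if the row has no NaN
    first = int(nan_positions[0]) if nan_positions.size else None
    first_nan_positions.append(first)

print(first_nan_positions)  # [5, 3, 5, None]
```

This is slower than the vectorised answers on large frames, but it stays close to the original attempt.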