I have a data frame that looks like this:
and I want to loop through each row and print the [i,j] position of each non-NaN entry. Here, the loop would ideally print "G56" and "G51".
So far I have created a T/F data frame that records all non-NaNs as True:
df_na = df.notnull()
and I can get the row index for any Trues:
for index, row in df_na.iterrows():
    if row.any():
        print(index)
but I can't get the column name. (I'm also concerned about this approach, since iterrows() is slower than itertuples().)
import numpy as np
import pandas as pd
# Sample frame: all NaN except two cells in column G
df = pd.DataFrame(np.nan, range(54, 62), [*'ABCDEFGHIJ'])
df.at[56, 'G'] = 3
df.at[61, 'G'] = 2
any with axis=1
df.index[df.notna().any(axis=1)]
Int64Index([56, 61], dtype='int64')
print(*df.index[df.notna().any(axis=1)], sep='\n')
56
61
numpy.where
i, j = np.where(df.notna())
print(*zip(df.index[i], df.columns[j]), sep='\n')
(56, 'G')
(61, 'G')
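If you want the labels printed in the "G56" style from the question rather than as tuples, the same np.where result can be joined into strings. This is only a minimal sketch on the toy frame from this answer; the name `labels` is mine, not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd

# Same toy frame as above: all NaN except two cells in column G.
df = pd.DataFrame(np.nan, index=range(54, 62), columns=list("ABCDEFGHIJ"))
df.at[56, "G"] = 3
df.at[61, "G"] = 2

# Positional (row, column) indices of every non-null cell.
i, j = np.where(df.notna().to_numpy())

# Map positions back to labels and join them as "<column><row>".
# `labels` is a hypothetical name used only for this illustration.
labels = [f"{col}{row}" for row, col in zip(df.index[i], df.columns[j])]
print(*labels, sep="\n")
# G56
# G61
```

This avoids both iterrows() and itertuples(), since the search for non-null cells happens in one vectorized call.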
stack
By default, stack drops null values
print(*df.stack().index.values, sep='\n')
(56, 'G')
(61, 'G')
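Whether stack drops nulls on its own can depend on the pandas version and on the arguments passed, so if you would rather not rely on the default, an explicit dropna() gives the same result either way. A small sketch under that assumption, again on the toy frame from this answer (`cells` is just an illustrative name):

```python
import numpy as np
import pandas as pd

# Same toy frame as above: all NaN except two cells in column G.
df = pd.DataFrame(np.nan, index=range(54, 62), columns=list("ABCDEFGHIJ"))
df.at[56, "G"] = 3
df.at[61, "G"] = 2

# stack() reshapes the frame into a Series keyed by (row, column).
# dropna() removes any NaN entries that stack() itself did not drop.
cells = df.stack().dropna()

# Each remaining entry is one non-null cell, with its value attached.
for (row, col), value in cells.items():
    print(row, col, value)
```

A side benefit of this form is that you get the cell values along with their positions, which the index-only version does not show.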
Using notnull returns a Boolean frame; sum across each row, then slice the index with the positions of the nonzero sums
df.index[df.notnull().sum(axis=1).to_numpy().nonzero()[0]]
Out[646]: Int64Index([56, 61], dtype='int64')