I have a dataframe set up in the following way:
header_1 | header_2 | header_3 | header_4
a | b | NaN | NaN
b | c | 9 | 10
x | y | NaN | 8
How can I select, using column indexes (the column names change), the rows where header_3 and header_4 are BOTH not NaN? header_3 and header_4 are integers.
Thank you
If the columns are defined in a list, select them, test for non-missing values with notnull, and use DataFrame.all with axis=1 to check that all values are True per row:
cols = ['header_3','header_4']
df = df[df[cols].notnull().all(axis=1)]
print (df)
  header_1 header_2  header_3  header_4
1        b        c       9.0      10.0
# Equivalent one-liner that avoids creating a separate list of columns:
# df[df[['header_3', 'header_4']].notnull().all(axis=1)]
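For anyone who wants to run the list-based approach end to end, here is a minimal sketch. The frame is rebuilt from the table in the question, so the exact values and the use of np.nan for the missing cells are assumptions. The names mask, filtered and restored are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (values assumed from the table shown).
df = pd.DataFrame({
    "header_1": ["a", "b", "x"],
    "header_2": ["b", "c", "y"],
    "header_3": [np.nan, 9, np.nan],
    "header_4": [np.nan, 10, 8],
})

cols = ["header_3", "header_4"]

# True only for rows where every column in cols is non-missing.
mask = df[cols].notnull().all(axis=1)
filtered = df[mask]
print(filtered)

# NaN forces a float dtype, so the surviving values show as 9.0 and 10.0.
# Once no NaN remain, the columns can be cast back to integers.
restored = filtered.astype({c: int for c in cols})
print(restored.dtypes)
```

The cast back to int is optional and only safe after filtering, because an integer column in plain NumPy dtype cannot hold NaN.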
To select by the last 2 columns, use iloc, which selects by position:
df = df[df.iloc[:, -2:].notnull().all(axis=1)]
It is also possible to specify the columns by their integer positions:
# Python counts from 0
df = df[df.iloc[:, [2,3]].notnull().all(axis=1)]
# Or use loc with the column names directly:
# df[df.loc[:, ['header_3', 'header_4']].notnull().all(axis=1)]
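Since the question says the column names change, a positional version may be the most useful one. The sketch below wraps the iloc approach in a small helper and applies it to a frame with renamed columns. The helper name rows_with_all_present and the new column names col_x and col_y are hypothetical, and the data is again assumed from the question's table.

```python
import numpy as np
import pandas as pd

def rows_with_all_present(frame, positions):
    """Keep rows where every column at the given integer positions is non-missing."""
    return frame[frame.iloc[:, positions].notnull().all(axis=1)]

# Example frame assumed from the question, with the last two columns renamed
# to show that the selection does not depend on the column names.
df = pd.DataFrame({
    "header_1": ["a", "b", "x"],
    "header_2": ["b", "c", "y"],
    "col_x": [np.nan, 9, np.nan],
    "col_y": [np.nan, 10, 8],
})

by_list = rows_with_all_present(df, [2, 3])             # explicit positions, counted from 0
by_slice = rows_with_all_present(df, slice(-2, None))   # last two columns, same as iloc[:, -2:]
print(by_list)
```

Both calls should return the same single row here, because positions 2 and 3 are also the last two columns.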
Or, if there are only 2 columns, chain the conditions with & (bitwise AND):
df = df[df['header_3'].notnull() & df['header_4'].notnull()]
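As a quick check that the chained form matches the all(axis=1) form, here is a small sketch on the example data (values assumed from the question's table). It also compares against DataFrame.dropna with subset, which is not part of the answer above but is a standard pandas call that should give the same rows.

```python
import numpy as np
import pandas as pd

# Example frame assumed from the question.
df = pd.DataFrame({
    "header_1": ["a", "b", "x"],
    "header_2": ["b", "c", "y"],
    "header_3": [np.nan, 9, np.nan],
    "header_4": [np.nan, 10, 8],
})

# Two boolean Series combined element-wise with bitwise AND.
chained = df[df["header_3"].notnull() & df["header_4"].notnull()]

# The list-based form from earlier in the answer.
with_all = df[df[["header_3", "header_4"]].notnull().all(axis=1)]

# Alternative not in the original answer: drop rows with NaN in either column.
dropped = df.dropna(subset=["header_3", "header_4"])

print(chained.equals(with_all), chained.equals(dropped))
```

If the conditions are comparisons rather than method calls, for example df['header_3'] > 0, each one needs its own parentheses, because & binds more tightly than comparison operators in Python.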