Print the colname and rowname for values that meet certain condition

Question

I am desperately trying to figure out how to print the row index and column name for specific values in my df.

I have the following df:

raw_data = {'first_name': [NaN, 'Molly', 'Tina', 'Jake', 'Amy'], 
        'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'], 
        'age': [42, 52, NaN, 24, 73], 
        'preTestScore': [4, 24, 31, 33, 3],
        'postTestScore': [25, 94, 57, 62, 70]}

df = pd.DataFrame(raw_data, columns = ['first_name', 'last_name', 'age', 
'preTestScore','postTestScore'])

I now want to print the row index and column name for each NaN, like this:

There is a missing value in row 0 for first_name.
There is a missing value in row 2 for age.

I have searched a lot but only ever found how to do this for a single row. My idea is to first create a boolean df of True and False values:

na = df.isnull()

Then I want to apply some function that prints the row number and column name for every NaN value. I just can't figure out how to do this.

Thanks in advance for any help!

piterbarg · Accepted Answer

I had to change the df a bit because a bare NaN is not defined in Python. I replaced it with np.nan:

import numpy as np
import pandas as pd

raw_data = {'first_name': [np.nan, 'Molly', 'Tina', 'Jake', 'Amy'], 
        'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'], 
        'age': [42, 52, np.nan, 24, 73], 
        'preTestScore': [4, 24, 31, 33, 3],
        'postTestScore': [25, 94, 57, 62, 70]}

# Build the DataFrame as in the question; the dict keys become the columns
df = pd.DataFrame(raw_data)

Then you can do this:

dfs = df.stack(dropna = False)
[f'There is a missing value in row {i[0]} for {i[1]}' for i in dfs[dfs.isna()].index]

which gives a list:

['There is a missing value in row 0 for first_name',
 'There is a missing value in row 2 for age']
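
A variation on the same idea, in case the dropna argument gives you trouble: as far as I know, recent pandas releases have been changing how stack handles dropna, so this sketch stacks the boolean mask from df.isna() instead of the data itself. The mask contains no NaN, so nothing can be dropped. The names mask and messages are just illustrative.

```python
import numpy as np
import pandas as pd

raw_data = {'first_name': [np.nan, 'Molly', 'Tina', 'Jake', 'Amy'],
            'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'],
            'age': [42, 52, np.nan, 24, 73],
            'preTestScore': [4, 24, 31, 33, 3],
            'postTestScore': [25, 94, 57, 62, 70]}
df = pd.DataFrame(raw_data)

# Stack the True/False mask rather than the data: it has no NaN,
# so stack() needs no dropna argument.
mask = df.isna().stack()

# Keep only the True entries; their index holds (row label, column name) pairs.
messages = [f'There is a missing value in row {row} for {col}.'
            for row, col in mask[mask].index]

for message in messages:
    print(message)
```

This should print the two sentences asked for in the question, one per line.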

Cainã Max Couto-Silva · Answer

As simple as:

np.where(df.isnull())

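It returns a tuple of two arrays: the row positions and the column positions of the NA values, respectively. These are integer positions, not labels, so use df.columns[j] (and df.index[i], if the index is not the default 0..n-1) to get the names.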

Example:

na_idx = np.where(df.isnull())
for i,j in zip(*na_idx):
    print(f'Row {i} and column {j} ({df.columns[j]}) is NA.')
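
One thing to watch with this approach: the values from np.where are positions, which only match the row labels when the index is the default 0..n-1. A minimal sketch, assuming a small made-up frame with string row labels ('a', 'b', 'c' are hypothetical, not from the question), that translates positions back to labels:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with string row labels, to show labels vs. positions.
df = pd.DataFrame({'first_name': [np.nan, 'Molly', 'Tina'],
                   'age': [42, 52, np.nan]},
                  index=['a', 'b', 'c'])

# np.where on the boolean array gives integer positions (row-major order).
rows, cols = np.where(df.isna().to_numpy())

# Translate each position into its row label and column name.
located = [(df.index[r], df.columns[c]) for r, c in zip(rows, cols)]

for row_label, col_name in located:
    print(f'Row {row_label} and column {col_name} is NA.')
```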

oskros · Answer

You could do something like the below:

for i, row in df.iterrows():
    nans = row[row.isna()].index
    for n in nans:
        print('row: %s, col: %s' % (i, n))
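
Since the title asks about values that meet a certain condition, the same loop can be run over any boolean mask, not just isna(). A rough sketch, assuming only the two score columns from the question and a made-up threshold of 25:

```python
import pandas as pd

df = pd.DataFrame({'preTestScore': [4, 24, 31, 33, 3],
                   'postTestScore': [25, 94, 57, 62, 70]})

threshold = 25  # hypothetical cutoff, chosen only for illustration

# Any element-wise comparison yields a True/False frame of the same shape.
mask = df < threshold

hits = []
for i, row in mask.iterrows():
    # row[row] keeps the True entries; their index is the matching column names.
    for col in row[row].index:
        hits.append((i, col))
        print('row: %s, col: %s' % (i, col))
```

Like the original loop, this goes row by row, so it is fine for small frames but will be slow on large ones.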

Tags:

python

pandas

dataframe

Pajul
