I have a dataframe with null/empty values in it.
I can easily get the count for each row of the null values by doing this:
df['NULL_COUNT'] = len(df.columns) - df.count(axis=1)
which puts the number of columns that are null in the field NULL_COUNT.
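(For reference, a minimal sketch of the same per-row count using isnull().sum, on a small made-up frame:)

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[0, 1, 2], [0, np.nan, 0], [0, 0, np.nan]],
                  columns=['A', 'B', 'C'])

# isnull() marks NaN cells as True; summing along axis=1 counts them per row
df['NULL_COUNT'] = df.isnull().sum(axis=1)
print(df['NULL_COUNT'].tolist())  # [0, 1, 1]
```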
Is there a way to similarly write the column headers of the null fields to another column?
df['NULL_FIELD_NAMES'] = "<some query expression>"
Example:
import numpy as np
import pandas as pd

df = pd.DataFrame([range(3), [0, np.nan, 0], [0, 0, np.nan], range(3), range(3)], columns=['A', 'B', 'C'])
In the df above, the 2nd row should have df['NULL_FIELD_NAMES'] = 'B'
and the 3rd row should have df['NULL_FIELD_NAMES'] = 'C'.
To check for null values in a pandas DataFrame, use isnull(), which returns a DataFrame of Boolean values that are True for NaN entries.
The shape attribute (not a method) returns the number of rows and columns as a tuple; df.shape[0] gives the row count, so comparing it with 0 tells you whether the DataFrame is empty.
To quickly find rows containing any null value in the entire DataFrame, chain isna() or isnull() with any(axis=1).
You can use:
df['new'] = df.isnull().dot(df.columns + ',').str.rstrip(',')
(The dot trick multiplies each column name by its Boolean null flag and concatenates the results, so non-adjacent null columns don't leave stray commas behind.)
Another solution:
df['new'] = df.apply(lambda x: ','.join(x[x.isnull()].index),axis=1)
Sample:
df = pd.DataFrame([range(3), [np.nan, np.nan, 0], [0, 0, np.nan], range(3), range(3)],
                  columns=['A', 'B', 'C'])
print (df)
A B C
0 0.0 1.0 2.0
1 NaN NaN 0.0
2 0.0 0.0 NaN
3 0.0 1.0 2.0
4 0.0 1.0 2.0
df['new'] = df.apply(lambda x: ','.join(x[x.isnull()].index),axis=1)
print (df)
A B C new
0 0.0 1.0 2.0
1 NaN NaN 0.0 A,B
2 0.0 0.0 NaN C
3 0.0 1.0 2.0
4 0.0 1.0 2.0
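Putting it together, a self-contained sketch that fills both the question's NULL_COUNT and NULL_FIELD_NAMES columns from the same Boolean mask:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([range(3), [np.nan, np.nan, 0], [0, 0, np.nan], range(3), range(3)],
                  columns=['A', 'B', 'C'])

# Compute the mask once so the new columns don't feed back into it
mask = df.isnull()

# Number of null columns per row
df['NULL_COUNT'] = mask.sum(axis=1)

# Comma-joined names of the null columns per row
df['NULL_FIELD_NAMES'] = mask.dot(mask.columns + ',').str.rstrip(',')

print(df)
```

Computing mask before assigning the new columns keeps the column lists aligned; otherwise each new column would appear in the next isnull() call.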