I have a dataframe similar to
df = pd.DataFrame({'A': [1, np.nan,2,3, np.nan,4], 'B': [np.nan, 1,np.nan,2, 3, np.nan]})
df
A B
0 1.0 NaN
1 NaN 1.0
2 2.0 NaN
3 3.0 2.0
4 NaN 3.0
5 4.0 NaN
How do I count the number of occurrences of A is np.nan
but B not np.nan
, A not np.nan
but B is np.nan
, and A and B both not np.nan
?
I tried df.groupby(['A', 'B']).count()
but it doesn't read the rows with np.nan
.
Using
df.isnull().groupby(['A','B']).size()
Out[541]:
A B
False False 1
True 3
True False 2
dtype: int64
You can use DataFrame.isna
with crosstab
for count Trues values:
df1 = df.isna()
df2 = pd.crosstab(df1.A, df1.B)
print (df2)
B False True
A
False 1 3
True 2 0
For scalar:
print (df2.loc[False, False])
1
df2 = pd.crosstab(df1.A, df1.B).add_prefix('B_').rename(lambda x: 'A_' + str(x))
print (df2)
B B_False B_True
A
A_False 1 3
A_True 2 0
Then for scalar use indexing:
print (df2.loc['A_False', 'B_False'])
1
Another solution is use DataFrame.dot
by columns names with Series.replace
and Series.value_counts
:
df = pd.DataFrame({'A': [1, np.nan,2,3, np.nan,4, np.nan],
'B': [np.nan, 1,np.nan,2, 3, np.nan, np.nan]})
s = df.isna().dot(df.columns).replace({'':'no match'}).value_counts()
print (s)
B 3
A 2
no match 1
AB 1
dtype: int64
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With