Pandas count null values in a groupby function

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                       'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                       'C': [np.nan, 'bla2', np.nan, 'bla3', np.nan, np.nan, np.nan, np.nan]})

Output:

         A      B     C
    0  foo    one   NaN
    1  bar    one  bla2
    2  foo    two   NaN
    3  bar  three  bla3
    4  foo    two   NaN
    5  bar    two   NaN
    6  foo    one   NaN
    7  foo  three   NaN

I would like to use groupby to count the number of NaN values in column C for each combination of A and B.

Expected Output (EDIT):

         A      B     C    D
    0  foo    one   NaN    2
    1  bar    one  bla2    0
    2  foo    two   NaN    2
    3  bar  three  bla3    0
    4  foo    two   NaN    2
    5  bar    two   NaN    1
    6  foo    one   NaN    2
    7  foo  three   NaN    1

Currently I am trying this:

    df['count']=df.groupby(['A'])['B'].isnull().transform('sum')

But this is not working...

Thank You

asked Apr 10 '17 11:04

Stefan

1 Answer

I think you need groupby with sum of NaN values:

    df2 = df.C.isnull().groupby([df['A'],df['B']]).sum().astype(int).reset_index(name='count')
    print(df2)

         A      B  count
    0  bar    one      0
    1  bar  three      0
    2  bar    two      1
    3  foo    one      2
    4  foo  three      1
    5  foo    two      2

If you need to filter first, add boolean indexing:

    df = df[df['A'] == 'foo']
    df2 = df.C.isnull().groupby([df['A'],df['B']]).sum().astype(int)
    print(df2)

    A    B
    foo  one      2
         three    1
         two      2

Or, more simply:

    df = df[df['A'] == 'foo']
    df2 = df['B'].value_counts()
    print(df2)

    one      2
    two      2
    three    1
    Name: B, dtype: int64
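Note that the value_counts shortcut counts rows, not nulls, so it matches the null count only when every remaining row has NaN in C (as is the case for 'foo' in the sample data). A minimal sketch on the 'bar' rows, where the two results differ; the variable names `bar`, `row_counts` and `null_counts` are only illustrative:

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question (unfiltered).
df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
    'C': [np.nan, 'bla2', np.nan, 'bla3', np.nan, np.nan, np.nan, np.nan],
})

# Keep only the 'bar' rows, where C is not always NaN.
bar = df[df['A'] == 'bar']

# value_counts counts every row per value of B, null or not.
row_counts = bar['B'].value_counts().sort_index()

# Summing the boolean null mask per group counts only the NaN rows.
null_counts = bar['C'].isna().groupby(bar['B']).sum().astype(int).sort_index()

print(row_counts.to_dict())
print(null_counts.to_dict())
```

For 'bar', each value of B appears once, but only the 'two' row has NaN in C, so the grouped null mask is the safer choice when the data is mixed.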

EDIT: The solution is very similar; just use transform instead of sum so the result keeps one value per original row:

    df['D'] = df.C.isnull().groupby([df['A'],df['B']]).transform('sum').astype(int)
    print(df)

         A      B     C  D
    0  foo    one   NaN  2
    1  bar    one  bla2  0
    2  foo    two   NaN  2
    3  bar  three  bla3  0
    4  foo    two   NaN  2
    5  bar    two   NaN  1
    6  foo    one   NaN  2
    7  foo  three   NaN  1
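For anyone who wants to run this end to end, here is a self-contained sketch of the transform approach on the original, unfiltered frame. The helper name `count_nulls_per_group` is hypothetical and only wraps the one-liner above; `isna` is used as the alias of `isnull`.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
    'C': [np.nan, 'bla2', np.nan, 'bla3', np.nan, np.nan, np.nan, np.nan],
})


def count_nulls_per_group(frame, keys, column):
    """Hypothetical helper: for each row, count the nulls of `column` in that row's group."""
    # Boolean mask: True where the value is missing.
    is_null = frame[column].isna()
    # Group the mask by the key columns; transform broadcasts each
    # group's sum back to every row of that group.
    group_keys = [frame[k] for k in keys]
    return is_null.groupby(group_keys).transform('sum').astype(int)


df['D'] = count_nulls_per_group(df, ['A', 'B'], 'C')
print(df)
```

Unlike the attempt in the question, this takes the null mask from column C and groups by both A and B, which is what the expected output requires.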

jezrael
