I have a dataframe(edata) as given below
Domestic   Catsize    Type   Count
   1          0         1      1
   1          1         1      8
   1          0         2      11
   0          1         3      14
   1          1         4      21
   0          1         4      31
From this dataframe I want to calculate the sum of all counts where the logical AND of both variables (Domestic and Catsize) results in Zero (0) such that
1   0    0
0   1    0
0   0    0
The code I use to perform the process is
g=edata.groupby('Type')
q3=g.apply(lambda x:x[((x['Domestic']==0) & (x['Catsize']==0) |
                       (x['Domestic']==0) & (x['Catsize']==1) |
                       (x['Domestic']==1) & (x['Catsize']==0)
                       )]
            ['Count'].sum()
           )
q3
Type
1     1
2    11
3    14
4    31
This code works fine, however, if the number of variables in the dataframe increases then the number of conditions grows rapidly. So, is there a smart way to write a condition that states that if the ANDing the two (or more) variables result in a zero then perform the sum() function
at is a single element and using . loc maybe a Series or a DataFrame. Returning single value is not the case always. It returns array of values if the provided index is used multiple times.
By use + operator simply you can combine/merge two or multiple text/string columns in pandas DataFrame. Note that when you apply + operator on numeric columns it actually does addition instead of concatenation.
You can filter first using pd.DataFrame.all negated:
cols = ['Domestic', 'Catsize']
res = df[~df[cols].all(1)].groupby('Type')['Count'].sum()
print(res)
# Type
# 1     1
# 2    11
# 3    14
# 4    31
# Name: Count, dtype: int64
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With