boolean operation with groupby in pandas

I would like to use pandas.groupby in a particular way. Given a DataFrame with two boolean columns (call them col1 and col2) and an id column, I want to add a column in the following way:

For every entry, if (col2 is True) and (col1 is True for any of the entries with the same id), then assign True; otherwise assign False.

I have made a simple example:

import pandas as pd
df = pd.DataFrame([[0,1,1,2,2,3,3],[False, False, False, False, False, False, True],[False, True, False, False, True ,True, False]]).transpose()
df.columns = ['id', 'col1', 'col2']

This gives the following DataFrame:

   id   col1   col2
0   0  False  False
1   1  False   True
2   1  False  False
3   2  False  False
4   2  False   True
5   3  False   True
6   3   True  False

According to the above rule, the following column should be added:

0    False
1    False
2    False
3    False
4    False
5     True
6    False

Any ideas on an elegant way to do this?

882

asked Mar 22 '17 04:03

splinter

2 Answers

df.groupby('id').col1.transform('any') & df.col2

0    False
1    False
2    False
3    False
4    False
5     True
6    False
dtype: bool
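
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes the frame is built from a dict (so col1 and col2 are real bool columns rather than object columns), and the names `group_has_col1` and `result` are just illustrative, not from the question.

```python
import pandas as pd

# Same data as in the question, built from a dict so the dtypes are int/bool
df = pd.DataFrame({
    "id":   [0, 1, 1, 2, 2, 3, 3],
    "col1": [False, False, False, False, False, False, True],
    "col2": [False, True, False, False, True, True, False],
})

# transform('any') computes "is any col1 True?" per id and broadcasts
# that single value back onto every row of the group
group_has_col1 = df.groupby("id")["col1"].transform("any")

# Row-wise AND with col2 gives the requested column
# ("result" is a hypothetical column name)
df["result"] = group_has_col1 & df["col2"]
print(df)
```

Because `transform` returns a Series aligned with the original index, the `&` works row by row without any merge or reindexing.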

178

answered Oct 16 '22 05:10

piRSquared

This code will produce the output you requested:

df2 = df.merge(df.groupby('id')['col1'] # group on "id" and select 'col1'
                    .any()              # True if any items are True
                    .rename('cond2')    # name Series 'cond2'
                    .to_frame()         # make a dataframe for merging
                    .reset_index())     # reset_index to get id column back
print(df2.col2 & df2.cond2)             # True when 'col2' and 'cond2' are True
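
If you would rather not build an intermediate frame for the merge, the per-id flag can also be looked up with `Series.map`. This is only a sketch of an alternative, assuming the frame is built from a dict with bool columns; the names `any_col1_by_id` and `cond2` are illustrative.

```python
import pandas as pd

# Same data as in the question, built from a dict so the dtypes are int/bool
df = pd.DataFrame({
    "id":   [0, 1, 1, 2, 2, 3, 3],
    "col1": [False, False, False, False, False, False, True],
    "col2": [False, True, False, False, True, True, False],
})

# One boolean per id: True if any col1 in that group is True
any_col1_by_id = df.groupby("id")["col1"].any()

# Look up each row's id in that per-id Series; the result keeps df's index
cond2 = df["id"].map(any_col1_by_id)

result = df["col2"] & cond2
print(result)
```

Unlike the merge, this keeps the original index of df, which matters if the index is not a plain 0..n-1 range.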

answered Oct 16 '22 05:10

Craig

Tags: python, python-3.x, pandas, dataframe, pandas-groupby
