
Aggregate DataFrame over Index [duplicate]

I have the following DataFrame:

                    (polygon object)     ASSAULT     BURGLARY   bank     cafe    crossing
INCIDENTDATE                                                                            
2009-01-01 02:00:00                A           1           0       0        1           0
2009-01-01 02:00:00                A           1           0       0        1           0
2009-01-01 02:00:00                A           1           0       1        0           0
2009-01-01 02:00:00                A           1           0       0        0           1
2009-01-01 02:00:00                A           1           0       0        1           0
2009-01-04 11:00:00                B           0           1       1        0           0
2009-01-04 11:00:00                B           0           1       1        0           0
2009-01-04 11:00:00                B           0           1       0        0           0
2009-01-04 11:00:00                B           0           1       1        0           0
2009-01-04 11:00:00                B           0           1       0        1           0

I want to aggregate that DataFrame so that each 'INCIDENTDATE' appears only once.

While doing this, I want the value of each column (except polygon) to be 1 if it was 1 in at least one of the rows sharing that 'INCIDENTDATE'.

The final DataFrame should look like this:

                    (polygon object)    ASSAULT     BURGLARY    bank     cafe    crossing
INCIDENTDATE                                                                            
2009-01-01 02:00:00                A           1           0       1        1           1
2009-01-04 11:00:00                B           0           1       1        1           0

How would I achieve that in pandas? Googling my question pointed me to the groupby() function, but I really don't understand how I would use it here.

Charles David Mupende asked May 19 '26 07:05



1 Answer

I think you can just reset the index, then group by that new column and take the max of each group. Since the flags are 0 or 1, the max is 1 whenever at least one row in the group has a 1:

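# Move INCIDENTDATE out of the index into a regular column
df.reset_index(inplace=True)
# Column-wise max per INCIDENTDATE: 1 if any row in the group has a 1
df.groupby('INCIDENTDATE').max()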
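
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. The data is a shortened stand-in for the frame in the question, and the column name "polygon" is my assumption (the question only shows it as "(polygon object)"). It also groups on the index level directly with groupby(level=...), which is a variant of the above that skips the reset_index step rather than the exact code I posted.

```python
import pandas as pd

# Shortened stand-in for the question's data; "polygon" is an assumed column name
idx = pd.to_datetime(
    ["2009-01-01 02:00:00"] * 3 + ["2009-01-04 11:00:00"] * 2
)
df = pd.DataFrame(
    {
        "polygon": ["A", "A", "A", "B", "B"],
        "ASSAULT": [1, 1, 1, 0, 0],
        "BURGLARY": [0, 0, 0, 1, 1],
        "bank": [0, 1, 0, 1, 0],
        "cafe": [1, 0, 0, 0, 1],
        "crossing": [0, 0, 1, 0, 0],
    },
    index=pd.Index(idx, name="INCIDENTDATE"),
)

# Group on the named index level; max() yields 1 if any row in the group has a 1
result = df.groupby(level="INCIDENTDATE").max()
print(result)
```

Note that max() is also applied to the polygon column, so this only gives a sensible polygon value if it is constant within each INCIDENTDATE, as it is in the sample data.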
it's-yer-boy-chet answered May 22 '26 00:05



