Pandas: calculating the mean values of duplicate entries in a dataframe

I have been working with a dataframe in Python and pandas that contains duplicate entries in the first column. The dataframe looks something like this:

        sample_id    qual    percent
    0   sample_1       10         20
    1   sample_2       20         30
    2   sample_1       50         60
    3   sample_2       10         90
    4   sample_3      100         20

I want to write something that identifies duplicate entries within the first column and calculates the mean values of the subsequent columns. An ideal output would be something similar to the following:

        sample_id    qual    percent
    0   sample_1       30         40
    1   sample_2       15         60
    2   sample_3      100         20

I have been struggling with this problem all afternoon and would appreciate any help.

asked Oct 07 '16 14:10

David Ross

1 Answer

Group by the sample_id column with groupby and take the mean of each group:

    df.groupby('sample_id').mean().reset_index()

or

    df.groupby('sample_id', as_index=False).mean()

Either call gives you one row per sample_id, with qual and percent averaged over the duplicate entries.
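To check this against the sample data from the question, here is a minimal sketch. The DataFrame construction is reconstructed from the table in the question, and the variable names `result` and `alt` are only for illustration.

```python
import pandas as pd

# Sample data reconstructed from the table in the question
df = pd.DataFrame({
    "sample_id": ["sample_1", "sample_2", "sample_1", "sample_2", "sample_3"],
    "qual": [10, 20, 50, 10, 100],
    "percent": [20, 30, 60, 90, 20],
})

# Keep sample_id as a regular column instead of the index
result = df.groupby("sample_id", as_index=False).mean()

# Same idea: group first, then move sample_id back out of the index
alt = df.groupby("sample_id").mean().reset_index()

print(result)
```

Note that the means come back as floats (30.0 rather than 30), and that groupby sorts the group keys by default, so the rows appear in sample_id order.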

answered Sep 21 '22 09:09

piRSquared
