Python: mean() doesn't work when groupby aggregates dataframe to one line

I have a dataframe:

import pandas as pd
from pandas import Timestamp, Timedelta

# Build the dataframe from its to_dict() representation
time_to_rent = pd.DataFrame({
    'rentId': {0: 43.0, 1: 87.0, 2: 140.0, 3: 454.0, 4: 1458.0},
    'creditCardId': {0: 40, 1: 40, 2: 40, 3: 40, 4: 40},
    'createdAt': {0: Timestamp('2020-08-24 16:13:11.850216'), 1: Timestamp('2020-09-10 10:47:31.748628'), 2: Timestamp('2020-09-13 15:29:06.077622'), 3: Timestamp('2020-09-24 08:08:39.852348'), 4: Timestamp('2020-10-19 08:54:09.891518')},
    'updatedAt': {0: Timestamp('2020-08-24 20:26:31.805939'), 1: Timestamp('2020-09-10 20:05:18.759421'), 2: Timestamp('2020-09-13 18:38:10.044112'), 3: Timestamp('2020-09-24 08:53:22.512533'), 4: Timestamp('2020-10-19 17:10:09.110038')},
    'rent_time': {0: Timedelta('0 days 04:13:19.955723'), 1: Timedelta('0 days 09:17:47.010793'), 2: Timedelta('0 days 03:09:03.966490'), 3: Timedelta('0 days 00:44:42.660185'), 4: Timedelta('0 days 08:15:59.218520')},
})

The idea is to aggregate the dataframe by the 'creditCardId' column and get the mean value of 'rent_time'. The ideal output would be:

creditCardId        rent_time mean
40                  0 days 05:08:10.562342

If I run this code:

print (time_to_rent['rent_time'].mean())

it works fine and I get "0 days 05:08:10.562342" as output. But when I try to group with:

time_to_rent.groupby('creditCardId', as_index=False)[['rent_time']].mean()

I get an error back:

~\anaconda3\lib\site-packages\pandas\core\groupby\generic.py in _cython_agg_blocks(self, how, alt, numeric_only, min_count)
   1093 
   1094         if not (agg_blocks or split_frames):
-> 1095             raise DataError("No numeric types to aggregate")
   1096 
   1097         if split_items:

DataError: No numeric types to aggregate

If I use the command:

time_to_rent = time_to_rent.groupby('creditCardId', as_index=False)[['rent_time']]

it returns only "<pandas.core.groupby.generic.DataFrameGroupBy object at 0x000000000B5F2EE0>"

Could you please help me understand where my mistake is?

asked Oct 20 '20 15:10

OcMaRUS

1 Answer

It's not your mistake. This is possibly a bug in pandas, since Timedelta values can be averaged (as your plain mean() call shows). A workaround is to use apply:

time_to_rent.groupby('creditCardId')['rent_time'].apply(lambda x: x.mean())

Output:

creditCardId
40   0 days 05:08:10.562342200
Name: rent_time, dtype: timedelta64[ns]
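If you want the one-row-per-card dataframe from the question rather than a Series, one option is to add reset_index() to the same workaround. This is only a minimal sketch: it rebuilds just the two relevant columns from the question's data, and the name `result` is made up for illustration.

```python
import pandas as pd

# Only the two columns needed for the aggregation, taken from the question
time_to_rent = pd.DataFrame({
    'creditCardId': [40, 40, 40, 40, 40],
    'rent_time': pd.to_timedelta([
        '0 days 04:13:19.955723',
        '0 days 09:17:47.010793',
        '0 days 03:09:03.966490',
        '0 days 00:44:42.660185',
        '0 days 08:15:59.218520',
    ]),
})

# Same apply-based workaround as above; reset_index() turns the grouped
# Series into a dataframe with 'creditCardId' as a regular column
result = (
    time_to_rent.groupby('creditCardId')['rent_time']
    .apply(lambda x: x.mean())
    .reset_index()
)

print(result)
```

With a single credit card in the data, the 'rent_time' value here should match what time_to_rent['rent_time'].mean() gives on the whole column.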

answered Oct 08 '22 09:10

Quang Hoang

Tags: python, pandas, grouping, mean
