"No numeric types to aggregate" after groupby and mean

Tags:

python

pandas

I'm working with time series and trying to write a function that calculates the monthly average of the data. Here are some helper functions I prepared:

import datetime
import numpy
import pandas

def date_range_0(start, end):
    # Return every date from start to end (inclusive) as a numpy array.
    dates = [start + datetime.timedelta(days=i)
             for i in range((end - start).days + 1)]
    return numpy.array(dates)

def date_range_1(start, days):
    # days should be an integer: the number of consecutive dates to generate.
    return date_range_0(start, start + datetime.timedelta(days - 1))

x = date_range_1(datetime.datetime(2015, 5, 17), 4)

The output, x, is a simple array of dates:

array([datetime.datetime(2015, 5, 17, 0, 0),
   datetime.datetime(2015, 5, 18, 0, 0),
   datetime.datetime(2015, 5, 19, 0, 0),
   datetime.datetime(2015, 5, 20, 0, 0)], dtype=object)

Then I learned about the groupby function from http://blog.csdn.net/youngbit007/article/details/54288603. I tried one example from that site and it works fine:

df = pandas.DataFrame({'key1':date_range_1(datetime.datetime(2015, 1, 17),5),
              'key2': [2015001,2015001,2015001,2015001,2015001],
              'data1': 1+0.1*numpy.arange(1,6)
        })
df

gives

   data1    key1    key2
0   1.1 2015-01-17  2015001
1   1.2 2015-01-18  2015001
2   1.3 2015-01-19  2015001
3   1.4 2015-01-20  2015001
4   1.5 2015-01-21  2015001

and

grouped=df['data1'].groupby(df['key2'])
grouped.mean()

gives

key2
2015001    1.3
Name: data1, dtype: float64

Then I try my own example:

datedat=numpy.array([date_range_1(datetime.datetime(2015, 1, 17),5),1+0.1*numpy.arange(1,6)]).T
months = [day.month for day in datedat[:,0]]
years = [day.year for day in datedat[:,0]]
datedatF = pandas.DataFrame({'key1': datedat[:,0],
                             'key2': list(numpy.array(years)*1000 + numpy.array(months)),
                             'data1': datedat[:,1]})
datedatF

which generated

   data1    key1    key2
0   1.1 2015-01-17  2015001
1   1.2 2015-01-18  2015001
2   1.3 2015-01-19  2015001
3   1.4 2015-01-20  2015001
4   1.5 2015-01-21  2015001

Note that this looks exactly the same as the table above. So far so good. Then I run:

grouped2=datedatF['data1'].groupby(datedatF['key2'])
grouped2.mean()

It throws this:

   ---------------------------------------------------------------------------
DataError                                 Traceback (most recent call last)
<ipython-input-170-f0d2bc225b88> in <module>()
  1 grouped2=datedatF['data1'].groupby(datedatF['key2'])
----> 2 grouped2.mean()

/root/anaconda3/lib/python3.6/site-packages/pandas/core/groupby.py in     mean(self, *args, **kwargs)
   1017         nv.validate_groupby_func('mean', args, kwargs)
   1018         try:
-> 1019             return self._cython_agg_general('mean')
   1020         except GroupByError:
   1021             raise

/root/anaconda3/lib/python3.6/site-packages/pandas/core/groupby.py in     _cython_agg_general(self, how, numeric_only)
    806 
    807         if len(output) == 0:
--> 808             raise DataError('No numeric types to aggregate')
    809 
    810         return self._wrap_aggregated_output(output, names)

DataError: No numeric types to aggregate

What did I do wrong? Why can't I take the mean of the second pandas.DataFrame? It looks exactly the same as the successful example!

962

asked Jan 09 '18 15:01

Harry

2 Answers

The data1 column in your DataFrame has dtype object, so you need to convert it with pd.to_numeric before aggregating (pd here assumes import pandas as pd):

datedatF.dtypes
Out[39]: 
data1            object
key1     datetime64[ns]
key2              int64
dtype: object
grouped2=pd.to_numeric(datedatF['data1']).groupby(datedatF['key2'])
grouped2.mean()
Out[41]: 
key2
2015001    1.3
Name: data1, dtype: float64
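
For context, here is a minimal sketch of why the column ends up as object in the first place, assuming the cause is the single numpy array in the question that holds both datetimes and floats. The variable names (dates, values, mixed, fixed) are mine, not from the question.

```python
import datetime

import numpy as np
import pandas as pd

start = datetime.datetime(2015, 1, 17)
# Same kind of input as the question: an object array of datetimes plus floats.
dates = np.array([start + datetime.timedelta(days=i) for i in range(5)])
values = 1 + 0.1 * np.arange(1, 6)

# One array cannot hold datetimes and float64 natively, so numpy falls back
# to the generic object dtype for every element, including the floats.
mixed = np.array([dates, values]).T
print(mixed.dtype)

df = pd.DataFrame({"key2": [2015001] * 5, "data1": mixed[:, 1]})
print(df["data1"].dtype)  # the column inherits object from the slice

# Converting back to a numeric dtype makes the aggregation work.
fixed = pd.to_numeric(df["data1"])
result = fixed.groupby(df["key2"]).mean()
print(result)
```

The printed table looks identical in both cases because object columns holding floats are displayed the same way as float64 columns; only dtypes reveals the difference.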

155

answered Oct 02 '22 05:10

BENY

Your data1 column has the object dtype rather than a numeric one:

In [396]: datedatF.dtypes
Out[396]:
data1            object   # <--- NOTE!
key1     datetime64[ns]
key2              int64
dtype: object

so try this:

In [397]: datedatF.assign(data1=pd.to_numeric(datedatF['data1'], errors='coerce')) \
                  .groupby('key2')['data1'].mean()
Out[397]:
key2
2015001    1.3
Name: data1, dtype: float64
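
If you control how the frame is built, another option is to avoid the mixed array entirely. This is only a sketch under my own assumptions: it uses pd.date_range instead of the question's helper functions, a column I named date, and a monthly period as the group key in place of the hand-built key2.

```python
import numpy as np
import pandas as pd

# Build each column separately so the values keep their float64 dtype.
dates = pd.date_range("2015-01-17", periods=5, freq="D")
frame = pd.DataFrame({"date": dates, "data1": 1 + 0.1 * np.arange(1, 6)})
print(frame.dtypes)

# Group by calendar month; to_period("M") maps each timestamp to its month.
monthly = frame.groupby(frame["date"].dt.to_period("M"))["data1"].mean()
print(monthly)
```

Because data1 never passes through an object array, no pd.to_numeric call is needed here.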

answered Oct 02 '22 05:10

MaxU - stop WAR against UA
