Group by max or min in a numpy array

I have two equal-length 1D numpy arrays, id and data, where id is a sequence of repeating, ordered integers that define sub-windows on data. For example:
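As a hypothetical illustration, such a pair of arrays could look like the following. The values are borrowed from the first answer below and are not the original question's data.

```python
import numpy as np

# Hypothetical sample data (same values as in the first answer below).
# Note: `id` shadows the Python builtin; the name is kept to match the question.
id = np.array([1, 1, 1, 2, 2, 2, 3, 3])
data = np.array([2, 7, 3, 8, 9, 10, 1, -10])

# id splits data into three contiguous sub-windows:
#   id == 1 -> [2, 7, 3]
#   id == 2 -> [8, 9, 10]
#   id == 3 -> [1, -10]
windows = [data[id == k].tolist() for k in np.unique(id)]
print(windows)
```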

I would like to aggregate data by grouping on id and taking either the max or the min.

In SQL, this would be a typical aggregation query like SELECT MAX(data) FROM tablename GROUP BY id ORDER BY id.

Is there a way I can avoid Python loops and do this in a vectorized manner?
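For reference, the loop-based approach that the question wants to avoid might look like this sketch. The function name `group_reduce_loop` and the sample values are assumptions made for illustration only.

```python
import numpy as np

def group_reduce_loop(ids, values, func=np.max):
    """Baseline: one boolean mask and one reduction per unique id (Python-level loop)."""
    keys = np.unique(ids)  # sorted unique ids, like ORDER BY id
    reduced = np.array([func(values[ids == k]) for k in keys])
    return keys, reduced

# Hypothetical sample data
ids = np.array([1, 1, 1, 2, 2, 2, 3, 3])
values = np.array([2, 7, 3, 8, 9, 10, 1, -10])

keys, loop_max = group_reduce_loop(ids, values, np.max)
_, loop_min = group_reduce_loop(ids, values, np.min)
print(keys, loop_max, loop_min)
```

This rescans the whole array once per group, so it scales poorly when there are many distinct ids.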

asked Dec 24 '11 06:12

Abiel

2 Answers

With only NumPy and without loops:

import numpy as np
import pandas as pd  # only needed for the cross-check at the end

id = np.asarray([1, 1, 1, 2, 2, 2, 3, 3])
data = np.asarray([2, 7, 3, 8, 9, 10, 1, -10])

# sort by id so each group is contiguous, then find where each group starts
_ndx = np.argsort(id)
_id, _pos = np.unique(id[_ndx], return_index=True)

# max: reduce each slice [_pos[i], _pos[i+1]) of the sorted data
g_max = np.maximum.reduceat(data[_ndx], _pos)

# min: same group boundaries, different ufunc
g_min = np.minimum.reduceat(data[_ndx], _pos)

# compare results with pandas groupby
np_group = pd.DataFrame({'min': g_min, 'max': g_max}, index=_id)
pd_group = pd.DataFrame({'id': id, 'data': data}).groupby('id').agg(['min', 'max'])

(pd_group.values == np_group.values).all()  # True
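Since the question states that id is already ordered, the sort can be skipped and the group boundaries read directly from the places where consecutive ids differ. The following is a minimal sketch of that variation, not the answerer's own code; the helper name `group_reduce_sorted` is hypothetical, and it assumes equal ids are contiguous.

```python
import numpy as np

def group_reduce_sorted(ids, values, ufunc=np.maximum):
    """Reduce `values` per run of equal `ids`; assumes equal ids are contiguous."""
    ids = np.asarray(ids)
    values = np.asarray(values)
    if ids.size == 0:
        return ids, values
    # A new group starts at index 0 and wherever an id differs from its predecessor.
    starts = np.concatenate(([0], np.flatnonzero(ids[1:] != ids[:-1]) + 1))
    return ids[starts], ufunc.reduceat(values, starts)

ids = np.array([1, 1, 1, 2, 2, 2, 3, 3])
values = np.array([2, 7, 3, 8, 9, 10, 1, -10])

keys, g_max = group_reduce_sorted(ids, values, np.maximum)
_, g_min = group_reduce_sorted(ids, values, np.minimum)
print(keys, g_max, g_min)
```

If the ids are not guaranteed to be contiguous, the argsort step from the answer above is still needed.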

answered Nov 08 '22 00:11

Marco Cerliani

I've packaged a version of my previous answer in the numpy_indexed package. It's nice to have this all wrapped up and tested in a neat interface, and it has a lot more functionality as well:

import numpy_indexed as npi
group_id, group_max_data = npi.group_by(id).max(data)

And so on for min and the other reductions.

answered Nov 07 '22 23:11

Eelco Hoogendoorn

Tags: python, python-3.x, group-by, numpy
