
How can I "zip sort" parallel numpy arrays?

If I have two parallel lists and want to sort them by the order of the elements in the first, it's very easy:

    >>> a = [2, 3, 1]
    >>> b = [4, 6, 7]
    >>> a, b = zip(*sorted(zip(a, b)))
    >>> print(a)
    (1, 2, 3)
    >>> print(b)
    (7, 4, 6)

How can I do the same using numpy arrays without unpacking them into conventional Python lists?

10 revs, 7 users 53% asked Dec 14 '09 21:12




2 Answers

b[a.argsort()] should do the trick.

Here's how it works. First you need to find a permutation that sorts a. argsort is a method that computes this:

    >>> import numpy
    >>> a = numpy.array([2, 3, 1])
    >>> p = a.argsort()
    >>> p
    array([2, 0, 1])

You can easily check that this is right:

    >>> a[p]
    array([1, 2, 3])

Now apply the same permutation to b.

    >>> b = numpy.array([4, 6, 7])
    >>> b[p]
    array([7, 4, 6])
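Put together, the steps above can be sketched as a small helper. The name `sort_parallel` is hypothetical (it is not part of NumPy), and `kind="stable"` is an optional choice added here so that equal values in `a` keep their original relative order. Note that this differs slightly from the `zip`/`sorted` version in the question, which breaks ties in `a` by comparing the values in `b`.

```python
import numpy as np

def sort_parallel(a, b):
    """Hypothetical helper: sort a and reorder b with the same permutation."""
    a = np.asarray(a)
    b = np.asarray(b)
    # p[i] is the index in a of the i-th smallest element
    p = a.argsort(kind="stable")
    # Fancy indexing returns new arrays; the inputs are left unchanged
    return a[p], b[p]

a_sorted, b_sorted = sort_parallel(np.array([2, 3, 1]), np.array([4, 6, 7]))
print(a_sorted)  # [1 2 3]
print(b_sorted)  # [7 4 6]
```

The same permutation `p` can be reused to reorder any number of additional parallel arrays.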
Jason Orendorff answered Sep 23 '22 11:09



Here's an approach that creates no intermediate Python lists, though it does require a NumPy "record array" for the sorting. If your two input arrays are actually related (like columns in a spreadsheet), a record array may be a better way to handle your data in general than keeping two distinct arrays around all the time. In that case you would already have a record array, and your original problem would be solved merely by calling sort() on it.

This does an in-place sort after packing both arrays into a record array:

    >>> from numpy import array, rec
    >>> a = array([2, 3, 1])
    >>> b = array([4, 6, 7])
    >>> c = rec.fromarrays([a, b])
    >>> c.sort()
    >>> c.f1   # fromarrays adds field names beginning with f0 automatically
    array([7, 4, 6])

Edited to use rec.fromarrays() for simplicity, skip redundant dtype, use default sort key, use default field names instead of specifying (based on this example).
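If the data stays in a record array, naming the fields and sorting on one of them explicitly may read more clearly than relying on the default names. The sketch below is a variation on the code above, not the answer's original form: the field names `key` and `value` are illustrative assumptions, passed through the `names` parameter of `rec.fromarrays`, and the sort column is chosen with the `order` argument of `sort()`.

```python
import numpy as np

a = np.array([2, 3, 1])
b = np.array([4, 6, 7])

# "key" and "value" are illustrative field names, not NumPy defaults
c = np.rec.fromarrays([a, b], names="key,value")

# In-place sort of whole records, ordered by the "key" field
c.sort(order="key")

print(c.key)    # [1 2 3]
print(c.value)  # [7 4 6]
```

Because whole records are moved together, the two columns cannot fall out of step with each other.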

Peter Hansen answered Sep 22 '22 11:09
