I have searched around and tried to find a solution to what seems to be a simple problem, but have come up with nothing. The problem is to sort a matrix based on its columns, progressively. So, if I have a numpy matrix like:
import numpy as np
X=np.matrix([[0,0,1,2],[0,0,1,1],[0,0,0,4],[0,0,0,3],[0,1,2,5]])
print(X)
[[0 0 1 2]
[0 0 1 1]
[0 0 0 4]
[0 0 0 3]
[0 1 2 5]]
I would like to sort it based on the first column, then the second, the third, and so on, to get a result like:
Xsorted=np.matrix([[0,0,0,3],[0,0,0,4],[0,0,1,1],[0,0,1,2],[0,1,2,5]])
print(Xsorted)
[[0 0 0 3]
[0 0 0 4]
[0 0 1 1]
[0 0 1 2]
[0 1 2 5]]
While I think it is possible to sort a matrix like this by naming the columns and all that, I would prefer to have a method for sorting that doesn't depend so much on how big the matrix is. I am using Python 3.4, if that is important.
Any help would be greatly appreciated!
It's not going to be particularly fast, but you can always convert your rows to tuples (X.A gives the matrix as a plain ndarray), then use Python's sort, which compares tuples element by element:
np.matrix(sorted(map(tuple, X.A)))
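You can also use np.lexsort, as suggested in this answer to a somewhat related question. lexsort treats the last key as the primary one, so the transposed columns are reversed to make the first column the primary sort key: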
X[np.lexsort(X.T[::-1])]
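As a rough sketch of the lexsort idea on the example from the question, here is the same one-liner wrapped in a small function. It assumes a plain ndarray rather than np.matrix (the input is converted with np.asarray), and the function name sort_rows_lexicographically is just an illustrative choice, not something from NumPy:

```python
import numpy as np

def sort_rows_lexicographically(a):
    """Sort rows by column 0, then column 1, and so on (hypothetical helper)."""
    a = np.asarray(a)  # assumption: work on a plain 2-D ndarray
    # a.T has one row per column of a. lexsort uses the LAST key as the
    # primary key, so reversing puts column 0 last, i.e. makes it primary.
    order = np.lexsort(a.T[::-1])
    return a[order]

X = np.array([[0, 0, 1, 2],
              [0, 0, 1, 1],
              [0, 0, 0, 4],
              [0, 0, 0, 3],
              [0, 1, 2, 5]])
X_sorted = sort_rows_lexicographically(X)
print(X_sorted)
```

Nothing here depends on the number of rows or columns, which was the requirement in the question.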
The lexsort approach appears to be faster, though you should test with your actual data to make sure:
In [20]: X = np.matrix(np.random.randint(10, size=(100,100)))
In [21]: %timeit np.matrix(sorted(map(tuple, X.A)))
100 loops, best of 3: 2.23 ms per loop
In [22]: %timeit X[np.lexsort(X.T[::-1])]
1000 loops, best of 3: 1.22 ms per loop
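If you want to repeat the comparison outside IPython, something like the following script could work. It is only a sketch: it uses plain ndarrays instead of np.matrix, the function names are made up for illustration, and the timings it prints will depend on your machine and data.

```python
import timeit
import numpy as np

np.random.seed(0)  # fixed seed so repeated runs use the same data
A = np.random.randint(10, size=(100, 100))

def sort_with_tuples(a):
    # Convert each row to a tuple and let Python compare them element-wise.
    return np.array(sorted(map(tuple, a)))

def sort_with_lexsort(a):
    # Reverse the transposed columns so column 0 is the primary key.
    return a[np.lexsort(a.T[::-1])]

# Both approaches should give the same row order before timing them.
assert np.array_equal(sort_with_tuples(A), sort_with_lexsort(A))

runs = 100
for fn in (sort_with_tuples, sort_with_lexsort):
    seconds = timeit.timeit(lambda: fn(A), number=runs)
    print("{}: {:.3f} ms per call".format(fn.__name__, seconds / runs * 1e3))
```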