Let's say that I have some list:
lst = [[2,6],[1,4],[0,1],[1,1],[2,3],[0,2]]
I want to group the sublists by their first element, keep from each group the sublist with the largest second element, and get the result sorted by the first element.
So the result would be:
>>> results
[[0, 2], [1, 4], [2, 6]]
Can someone kindly help me?
You can do it using np.maximum.reduceat:
import numpy as np
lst = np.array([[2,6],[1,4],[0,1],[1,1],[2,3],[0,2]])
lst = lst[np.argsort(lst[:,0])]  # sort the rows by the first column
u, idx = np.unique(lst[:,0], return_index = True)
print(np.c_[u, np.maximum.reduceat(lst[:,1], idx)])
First, the array has to be sorted by its first column. Then you need the indices that split the sorted array into groups, idx = [0, 2, 4], and the corresponding values of the first column, u = [0, 1, 2]. Finally, np.maximum.reduceat computes the maximum of the second column within each group that starts at the indices in idx, and np.c_ places those maxima to the right of u as a second column.
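To make the steps easier to follow, here is a small sketch of the same approach that prints the intermediate values. The names pairs and group_max are only illustrative, and kind="stable" is an optional choice made here so that the sorted order is reproducible; it is not required for the result.

```python
import numpy as np

pairs = np.array([[2, 6], [1, 4], [0, 1], [1, 1], [2, 3], [0, 2]])

# Step 1: sort the rows by the first column.
pairs = pairs[np.argsort(pairs[:, 0], kind="stable")]

# Step 2: unique keys and the index where each group starts.
u, idx = np.unique(pairs[:, 0], return_index=True)
print(u)    # [0 1 2]
print(idx)  # [0 2 4]

# Step 3: maximum of the second column within each group.
group_max = np.maximum.reduceat(pairs[:, 1], idx)
print(group_max)  # [2 4 6]

# Step 4: put keys and maxima side by side as columns.
result = np.c_[u, group_max]
print(result.tolist())  # [[0, 2], [1, 4], [2, 6]]
```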
Remark: I used numpy here, a widely used library that pushes the looping down to the C level, which is much faster. Pure Python solutions are worth attention too.
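For comparison, a pure Python version might look like the sketch below. It is one possible approach, not the only one: the helper name max_per_key is made up for this example, and it assumes every sublist holds exactly two comparable values.

```python
def max_per_key(pairs):
    """Return [key, max_value] pairs, sorted by key."""
    best = {}
    for key, value in pairs:
        # Keep the largest second element seen so far for this key.
        if key not in best or value > best[key]:
            best[key] = value
    return [[key, best[key]] for key in sorted(best)]


lst = [[2, 6], [1, 4], [0, 1], [1, 1], [2, 3], [0, 2]]
print(max_per_key(lst))  # [[0, 2], [1, 4], [2, 6]]
```

This makes a single pass over the input and only sorts the distinct keys at the end, so the input itself does not need to be sorted first.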
Bonus: This is actually a one-liner using numpy_indexed, a library (not so widely used) dedicated to group-by operations on arrays:
import numpy_indexed as npi
import numpy as np
# lst must be the NumPy array defined above, not a plain Python list
np.transpose(npi.group_by(lst[:, 0]).max(lst[:, 1]))
Assuming you just have 'pairs' like this (i.e. each sublist holds 2 ints, and every 1st value appears in exactly two sublists), it's very simple:
>>> lst = [[2,6],[1,4],[0,1],[1,1],[2,3],[0,2]]
>>> sorted(lst)[1::2]
[[0, 2], [1, 4], [2, 6]]
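By default, sorting the list orders the sublists by their 1st value and then by their 2nd, so within each pair the sublist with the larger 2nd value comes last. Slicing the sorted list with [1::2] then takes every other item, starting from the second.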
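The slice only works when every 1st value occurs exactly twice. If the groups can have different sizes, the same idea (sort, then take the last item of each group) can be sketched with itertools.groupby. The function name last_of_each_group is just an illustrative choice.

```python
from itertools import groupby
from operator import itemgetter


def last_of_each_group(pairs):
    """Sort the pairs, then keep the last (largest) pair of each group."""
    ordered = sorted(pairs)  # sorts by 1st value, then by 2nd value
    return [list(group)[-1] for _, group in groupby(ordered, key=itemgetter(0))]


lst = [[2, 6], [1, 4], [0, 1], [1, 1], [2, 3], [0, 2]]
print(last_of_each_group(lst))  # [[0, 2], [1, 4], [2, 6]]

# Groups of uneven size are handled as well.
uneven = lst + [[0, 5]]
print(last_of_each_group(uneven))  # [[0, 5], [1, 4], [2, 6]]
```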