
Python -- rank tuples by first item, resolve ties by second item

Tags: python, ranking

I have the following list of tuples:

[(1, 6), (2, 3), (2, 5), (2, 2), (1, 7), (3, 2), (2, 2)]

I would like to rank this list by the first value in the tuple and resolve ties by the second value, so that the output looks like this:

[1, 5, 6, 3, 2, 7, 3]

I couldn't think of a simple way to do this, so I looked for something like the scipy.stats.rankdata function. However, for my use case it lacks an equivalent of the order argument in numpy.argsort. I feel like I'm missing something obvious here, in which case I apologise for not searching more thoroughly!

EDIT:

To explain better what I am trying to achieve:

Given a list of tuples

>>> l = [(1, 6), (2, 3), (2, 5), (2, 2), (1, 7), (3, 2), (2, 2)]

I want to create a list containing the rank of the elements of the list l. For example, ranking by the first value in each tuple:

>>> from scipy import stats
>>> stats.rankdata([i for i, j in l], method='min')
array([ 1.,  3.,  3.,  3.,  1.,  7.,  3.])

This is almost what I want, but there are ties in the result (1. appears twice and 3. appears four times).

I would like to break the ties using the second value in each tuple, so that for example the two tuples (2, 2) will have the same rank, but the (2, 3) and (2, 5) will have a different rank. The resulting list should look like this:

array([ 1.,  5.,  6.,  3.,  2.,  7.,  3.])
robodasha asked Dec 12 '25 02:12


2 Answers

Python compares tuples element by element, so sorting them already orders by the first item and breaks ties with the second.

>>> import operator
>>> [x for x, y in sorted(enumerate([(1, 6), (2, 3), (2, 5), (2, 2), (1, 7), (3, 2), (2, 2)], start=1), key=operator.itemgetter(1))]
[1, 5, 4, 7, 2, 3, 6]
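
Note that the list above is the sort order (the 1-based positions of the elements in sorted order), not the rank of each element. One way to turn that order into the ranks asked for in the question might look like the sketch below. The function name ranks_from_sort_order is made up for illustration, and it assumes that equal tuples should share the lowest rank, as in the question's expected output.

```python
import operator


def ranks_from_sort_order(pairs):
    """Hypothetical helper: 1-based ranks, equal tuples share the lowest rank."""
    # Pair each tuple with its original index, then sort by the tuple itself.
    ordered = sorted(enumerate(pairs), key=operator.itemgetter(1))
    ranks = [0] * len(pairs)
    prev = object()  # sentinel that never equals a real tuple
    current = 0
    for pos, (idx, item) in enumerate(ordered, start=1):
        if item != prev:
            # A new distinct tuple starts a new rank at its sorted position.
            current = pos
        ranks[idx] = current
        prev = item
    return ranks


l = [(1, 6), (2, 3), (2, 5), (2, 2), (1, 7), (3, 2), (2, 2)]
print(ranks_from_sort_order(l))  # [1, 5, 6, 3, 2, 7, 3]
```

The two (2, 2) tuples end up with rank 3, and the next distinct tuple gets rank 5, which matches the behaviour of method='min' in the question.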
Ignacio Vazquez-Abrams answered Dec 16 '25 00:12


Thanks to Ignacio Vazquez-Abrams' answer I managed to find a solution! It's perhaps not the most efficient way to do this, but it works.

>>> from scipy import stats
>>> l = [(1, 6), (2, 3), (2, 5), (2, 2), (1, 7), (3, 2), (2, 2)]
>>> uniq = set(l)
>>> s = sorted(uniq)
>>> r = [s.index(i) for i in l]
>>> rank = stats.rankdata(r, method='min')
>>> rank
array([ 1.,  5.,  6.,  3.,  2.,  7.,  3.])
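
On the efficiency point: s.index(i) scans the sorted list for every element, so the lookup step is quadratic in the worst case. If that matters, a standard-library-only variant could use a binary search instead. This is just a sketch, not a drop-in replacement for rankdata: the name rank_min is made up, it returns plain ints rather than a float array, and it only reproduces the 'min' tie handling.

```python
from bisect import bisect_left


def rank_min(pairs):
    """Hypothetical helper: 1-based 'min' ranks of tuples, without scipy."""
    ordered = sorted(pairs)  # ties on the first item fall back to the second
    # bisect_left finds the first position of each tuple in the sorted list,
    # so equal tuples get the same (lowest) rank.
    return [bisect_left(ordered, p) + 1 for p in pairs]


l = [(1, 6), (2, 3), (2, 5), (2, 2), (1, 7), (3, 2), (2, 2)]
print(rank_min(l))  # [1, 5, 6, 3, 2, 7, 3]
```

Sorting once and doing one binary search per element keeps the whole thing at O(n log n).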
robodasha answered Dec 16 '25 02:12


