
What is the sorting logic behind np.lexsort?

Tags:

python

numpy

How does this function work?

import numpy as np
first_names = (5,5,5)
last_names = (3,1,2)
x = np.lexsort((first_names, last_names))
print(x)

It gives the output [1 2 0]. I assume that the two lists are sorted by the variable last_names. If so, how can the number 2 have index 0? 2 is between 1 and 3, so I don't understand how this sorting works. Please explain.

George asked Sep 02 '25 09:09


1 Answer

Essentially, np.lexsort((first_names, last_names)) means: sort by last_names first, then break ties using first_names. The last key in the tuple is the primary sort key.

Reading the documentation, particularly the example under "Sort two columns of numbers:", makes this clear. You first sort by last_names, which puts index 1 (value 1) first, index 2 (value 2) second, and index 0 (value 3) third. That is why the output is [1 2 0]: it lists the indices of the original elements in sorted order, not the rank of each element. Taking last_names in that order gives (1, 2, 3), i.e. it is sorted. If there were any ties in last_names, the corresponding values in first_names would be the tie breaker.
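
A quick way to check this reading is to use the result as an index into the original data. This is only a small sketch built on the question's values; the name order is introduced here for illustration.

```python
import numpy as np

first_names = (5, 5, 5)
last_names = (3, 1, 2)

# lexsort returns indices into the original data, not ranks
order = np.lexsort((first_names, last_names))
print(order)  # [1 2 0]

# Indexing last_names with those indices yields the sorted values
sorted_last = np.asarray(last_names)[order]
print(sorted_last)  # [1 2 3]
```

So the 0 in the last position means "the element originally at index 0 (value 3) goes last", not "the value 2 has index 0".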

For example, consider this case:

first_names = (5,5,4)
last_names = (3,1,1)

There is a tie between indices 1 and 2 in last_names (both have the value 1), which is broken by the corresponding values in first_names. There, index 2 (value 4) is lower than index 1 (value 5), so index 2 comes first. The resulting lexsort is therefore [2, 1, 0]:

np.lexsort((first_names, last_names))
# array([2, 1, 0])
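
If it helps, the same ordering can be reproduced in two other ways. The sketch below is not how NumPy implements lexsort internally; it only illustrates equivalent behavior for this example, and the names by_tuple, step1 and step2 are made up here.

```python
import numpy as np

first_names = np.array([5, 5, 4])
last_names = np.array([3, 1, 1])

order = np.lexsort((first_names, last_names))

# Equivalent 1: sort the indices by (primary key, tie-breaker) tuples
by_tuple = sorted(
    range(len(last_names)),
    key=lambda i: (last_names[i], first_names[i]),
)

# Equivalent 2: stable sort by the tie-breaker first, then by the primary key
step1 = np.argsort(first_names, kind="stable")
step2 = step1[np.argsort(last_names[step1], kind="stable")]

print(order)     # [2 1 0]
print(by_tuple)  # [2, 1, 0]
print(step2)     # [2 1 0]
```

The second variant shows why the primary key comes last in the tuple: keys are applied in order, and each later stable sort takes priority over the earlier ones.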

sacuL answered Sep 04 '25 22:09