Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Given a "jumbled" list L, get a list where each element is the index of the corresponding element of L, if L was sorted

Tags:

python

Desired Output

I want a function to return a list such that, given a "jumbled" list l, each element is the index of the corresponding element of l, if l was sorted. (I'm failing to think of a less convoluted way of saying this, sorry.)

Examples

f([3,1,2]) = [2,0,1]

f([3,1,2,2,3]) = [3,0,1,2,4], since the input sorted is [1,2,2,3,3].

(This is useful for some stats calculations.)

My Attempt

I came up with a way to do this function, but this is python- it seems like there should be a one-liner to do this, or at least a much cleaner, clearer way.

def getIndiciesInSorted(l):
    sortedL = sorted(l)
    outputList = []
    for num in l:
        sortedIndex = sortedL.index(num)
        outputList.append(sortedIndex)
        sortedL[sortedIndex] = None
    return outputList

l=[3,1,2,2,3] 
print getIndiciesInSorted(l)

So, how can I write this more concisely? Is there a legible list comprehension solution?

like image 283
Matthew Adams Avatar asked Dec 06 '22 12:12

Matthew Adams


1 Answers

def argsort(seq):
    # http://stackoverflow.com/questions/3382352/3382369#3382369
    # http://stackoverflow.com/questions/3071415/3071441#3071441
    '''
    >>> seq=[1,3,0,4,2]
    >>> index=argsort(seq)
    [2, 0, 4, 1, 3]

    Given seq and the index, you can construct the sorted seq:
    >>> sorted_seq=[seq[x] for x in index]
    >>> assert sorted_seq == sorted(seq)

    Given the sorted seq and the index, you can reconstruct seq:
    >>> assert [sorted_seq[x] for x in argsort(index)] == seq
    '''
    return sorted(range(len(seq)), key=seq.__getitem__)

def f(seq):
    idx = argsort(seq)
    return argsort(idx)

print(f([3,1,2]))
# [2, 0, 1]

print(f([3,1,2,2,3]))
# [3, 0, 1, 2, 4]

Note that nightcracker's function is faster:

def get_sorted_indices(l):
    sorted_positions = sorted(range(len(l)), key=l.__getitem__)
    result = [None for _ in range(len(l))]
    for new_index, old_index in enumerate(sorted_positions):
        result[old_index] = new_index
    return result

The difference may be significant for long lists:

In [83]: import random
In [98]: l = [random.randrange(100) for _ in range(10000)]
In [104]: timeit get_sorted_indices(l)
100 loops, best of 3: 4.73 ms per loop

In [105]: timeit f(l)
100 loops, best of 3: 6.64 ms per loop
like image 131
unutbu Avatar answered Dec 11 '22 09:12

unutbu