Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sort a subset of a python list to have the same relative order as in other list

So having a list say b = [b1, b2, b3] I want to be able to sort a list a in such a way that all bi's that also exist in a have the same relative order as in b - leaving the rest of a's elements alone. So

a = [ b1, x, b3, y, b2] -> [ b1, x, b2, y, b3]
a = [ b1, x, b2, y, b3] -> no change
a = [ b1, x, y, b2]     -> no change
a = [ b3, x, b1, y, b2] -> [ b1, x, b2, y, b3]

b of course may be a tuple or any other ordered structure. What I came up with

bslots = dict((x, a.index(x)) for x in a if x in b)
bslotsSorted = sorted(bslots.keys(), key=lambda y: b.index(y))
indexes = sorted(bslots.values())
for x,y in zip(bslotsSorted, indexes):
  a[y] = x

is clumsy and O(n^2)

like image 713
Mr_and_Mrs_D Avatar asked May 28 '15 10:05

Mr_and_Mrs_D


People also ask

How do you sort a list by order another list in Python?

sort() is one of Python's list methods for sorting and changing a list. It sorts list elements in either ascending or descending order. sort() accepts two optional parameters. reverse is the first optional parameter.

How do you sort one list with respect to another in Python?

Approach : Zip the two lists. Create a new, sorted list based on the zip using sorted(). Using a list comprehension extract the first elements of each pair from the sorted, zipped list.

How do you sort a subset in Python?

You can sort rows using the sort_values method. You have to pass in the column name that you want to sort by your dataframe. For example if we apply the sort_values method on the weight_kg column name, we will get them sorted by their weights. We will get the lightest person on top, heaviest at the bottom.

What is the difference sort () and sorted () in Python?

The main difference between sort and sorted in Python is that sort function returns nothing and makes changes to the original sequence, while the sorted () function creates a new sequence type containing a sorted version of the given sequence.


1 Answers

  • First create a dictionary using items from b where the key is the item and value is its index, we will use this to sort the matched items in a later on.

  • Now filter out item from a that are present in that dict, dict provides O(1) lookup.

  • Now sort this list of filtered items and convert it to an iterator.

  • Now loop over a again and for each item check if is present in dict then fetch its value from iterator otherwise use it as is.

def solve(a, b):
    dct = {x: i for i, x in enumerate(b)}
    items_in_a = [x for x in a if x in dct]
    items_in_a.sort(key=dct.get)
    it = iter(items_in_a)
    return [next(it) if x in dct else x for x in a]
...
>>> b = ['b1', 'b2', 'b3']
>>> a = [ 'b1', 'x', 'b3', 'y', 'b2']
>>> solve(a, b)
['b1', 'x', 'b2', 'y', 'b3']
>>> a = [ 'b1', 'x', 'b2', 'y', 'b3']
>>> solve(a, b)
['b1', 'x', 'b2', 'y', 'b3']
>>> a = [ 'b1', 'x', 'y', 'b2']
>>> solve(a, b)
['b1', 'x', 'y', 'b2']
>>> a = [ 'b3', 'x', 'b1', 'y', 'b2']
>>> solve(a, b)
['b1', 'x', 'b2', 'y', 'b3']

Overall time complexity is going to be max of (O(len(a)), O(len(b)), O(items_in_a_length log items_in_a_length).

like image 113
Ashwini Chaudhary Avatar answered Nov 14 '22 22:11

Ashwini Chaudhary