Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

list match in python: get indices of a sub-list in a larger list

For two lists,

a = [1, 2, 9, 3, 8, ...]   (no duplicate values in a, but a is very big)
b = [1, 9, 1,...]          (set(b) is a subset of set(a), 1<<len(b)<<len(a)) 

indices = get_indices_of_a(a, b)

how to let get_indices_of_a return indices = [0, 2, 0,...] with array(a)[indices] = b? Is there a faster method than using a.index, which is taking too long?

Making b a set is a fast method of matching lists and returning indices (see compare two lists in python and return indices of matched values ), but it will lose the index of the second 1 as well as the sequence of the indices in this case.

like image 898
user1342516 Avatar asked Apr 30 '12 14:04

user1342516


People also ask

How do you compare Sublists in Python?

With any(a in s for s in b) you check whether the list a is an element of any of the sublists of b . While x in y will return True if x is a substring of y (or even y itself) if both are strings, the same does not work with lists: here, x has to be an element of y , not a sublist. Save this answer.

How do you get the indices of all occurrences of an element in a list in Python?

One of the most basic ways to get the index positions of all occurrences of an element in a Python list is by using a for loop and the Python enumerate function. The enumerate function is used to iterate over an object and returns both the index and element.

How do you find the index of a matching element in Python?

How to Find the Index of a List Element in Python. You can use the index() method to find the index of the first element that matches with a given search object. The index() method returns the first occurrence of an element in the list. In the above example, it returns 1, as the first occurrence of “Bob” is at index 1.


2 Answers

A fast method (when a is a large list) would be using a dict to map values in a to indices:

>>> index_dict = dict((value, idx) for idx,value in enumerate(a))
>>> [index_dict[x] for x in b]
[0, 2, 0]

This will take linear time in the average case, compared to using a.index which would take quadratic time.

like image 132
interjay Avatar answered Nov 13 '22 07:11

interjay


Presuming we are working with smaller lists, this is as easy as:

>>> a = [1, 2, 9, 3, 8] 
>>> b = [1, 9, 1] 
>>> [a.index(item) for item in b]
[0, 2, 0]

On larger lists, this will become quite expensive.

(If there are duplicates, the first occurrence will always be the one referenced in the resulting list, if not set(b) <= set(a), you will get a ValueError).

like image 35
Gareth Latty Avatar answered Nov 13 '22 07:11

Gareth Latty