
Select sublists from python list, beginning and ending on the same element

I have a (very large) list similar to:

a = ['A', 'B', 'A', 'B', 'A', 'C', 'D', 'E', 'D', 'E', 'D', 'F', 'G', 'A', 'B']

and I want to extract from it a list of lists like:

result = [['A', 'B', 'A', 'B', 'A'], ['D', 'E', 'D', 'E', 'D']]

The repeating patterns can be different, for example there can also be intervals such as:

['A', 'B', 'C', 'A', 'D', 'E', 'A'] (with a 'jump' over two elements)

I have written some very simple code that seems to work:

tolerance = 2
counter = 0
start, stop = 0, 0
for idx in range(len(a) - 1):
    if a[idx] == a[idx+1] and counter == 0:
        start = idx
        counter += 1
    elif a[idx] == a[idx+1] and counter != 0:
        if tolerance <= 0: 
            stop = idx
        tolerance = 2
    elif a[idx] != a[idx+1]:
        tolerance -= 1
    if start != 0 and stop != 0:
        result = [a[start::stop]]

But 1) it is very cumbersome, and 2) I need to apply this to very large lists. Is there a more concise and faster way of implementing it?

EDIT: As @Kasramvd correctly pointed out, I need the largest sublist that satisfies the requirement (no more than the tolerance number of jumps between the start and end elements), so I take:

['A', 'B', 'A', 'B', 'A'] instead of [ 'B', 'A', 'B' ]

because the former includes the latter.

It would also be good if the code could select elements UP TO a given tolerance. For example, if the tolerance (the maximum number of elements not equal to the start or end element) is 2, it should also return sublists such as:

['A', 'A', 'A', 'B', 'A', 'B', 'A', 'C', 'D', 'A']

with tolerances 0, 1 and 2.
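
One way to read this requirement, as a minimal sketch: a sublist qualifies if it starts and ends on the same element and never contains more than tolerance consecutive other elements in between. The helper name satisfies_tolerance is hypothetical; it only pins down the definition and does not solve the extraction problem itself.

```python
def satisfies_tolerance(sub, tolerance):
    """Return True if `sub` starts and ends on the same element and no run of
    other elements between two occurrences is longer than `tolerance`.

    This is an assumed reading of the requirement, not a confirmed spec.
    """
    if len(sub) < 2 or sub[0] != sub[-1]:
        return False

    gap = 0  # number of consecutive elements different from sub[0]
    for item in sub[1:]:
        if item == sub[0]:
            gap = 0
        else:
            gap += 1
            if gap > tolerance:
                return False
    return True


example = ['A', 'A', 'A', 'B', 'A', 'B', 'A', 'C', 'D', 'A']
print(satisfies_tolerance(example, 2))  # True: the gaps are 0, 0, 1, 1 and 2
print(satisfies_tolerance(example, 1))  # False: 'C', 'D' is a gap of 2
```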

asked Sep 07 '18 10:09 by Qubix


1 Answer

Solution without any extra copying of lists other than the sublist results:

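def sublists(a, tolerance):
    result = []
    index = 0

    while index < len(a):
        curr = a[index]  # element the candidate sublist must start and end on

        # Scan forward, remembering the last position where curr occurs.
        for i in range(index, len(a)):
            if a[i] == curr:
                end = i
            elif i - end > tolerance:
                # More than `tolerance` other elements since the last curr.
                break

        # Keep only sublists that span more than a single element.
        if index != end:
            result.append(a[index:end+1])
        index = end + 1  # resume right after the sublist just examined

    return result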

Usage is simply as follows:

a = ['A', 'B', 'A', 'B', 'A', 'C', 'D', 'E', 'D', 'E', 'D', 'F', 'G', 'A', 'B']

sublists(a, 0)  # []
sublists(a, 1)  # [['A', 'B', 'A', 'B', 'A'], ['D', 'E', 'D', 'E', 'D']]
sublists(a, 2)  # [['A', 'B', 'A', 'B', 'A'], ['D', 'E', 'D', 'E', 'D']]
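
Since the question mentions very large lists, a lazy variant may be worth considering. The following is a sketch rather than part of the tested solution above: it uses the same scan but yields (start, end) index pairs instead of building slices, so the caller decides when to copy. The name iter_runs is hypothetical.

```python
def iter_runs(a, tolerance):
    """Yield (start, end) index pairs, with end inclusive, for each sublist
    that begins and ends on the same element within the given tolerance."""
    n = len(a)
    index = 0
    while index < n:
        curr = a[index]
        end = index  # last position seen so far where a[i] == curr
        for i in range(index + 1, n):
            if a[i] == curr:
                end = i
            elif i - end > tolerance:
                break
        if end != index:
            yield index, end
        index = end + 1


a = ['A', 'B', 'A', 'B', 'A', 'C', 'D', 'E', 'D', 'E', 'D', 'F', 'G', 'A', 'B']
for start, end in iter_runs(a, 1):
    print(start, end, a[start:end + 1])
# 0 4 ['A', 'B', 'A', 'B', 'A']
# 6 10 ['D', 'E', 'D', 'E', 'D']
```

Each outer step re-scans at most tolerance + 1 elements past the end of the previous run, so the total work grows roughly with len(a) * (tolerance + 1).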

Possible solution to the extra requirement specified in the comments, replacing the body of the inner for loop:

if i > index and a[i] == a[i-1] == curr:
    end = i - 1
    break
elif a[i] == curr:
    end = i
elif i - end > tolerance:
    break

Note: I've not tested this thoroughly.
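
For reference, here is a sketch of how the fragment above would look when dropped into the body of the inner for loop. The function name sublists_split_on_repeat is hypothetical, and the behavior shown is only what the code does on these inputs; whether it matches the requirement from the comments is an assumption.

```python
def sublists_split_on_repeat(a, tolerance):
    """Like the tolerance-based scan, but stop a sublist as soon as the
    start element appears twice in a row."""
    result = []
    index = 0

    while index < len(a):
        curr = a[index]

        for i in range(index, len(a)):
            if i > index and a[i] == a[i-1] == curr:
                # Two adjacent occurrences of curr: end just before the repeat.
                end = i - 1
                break
            elif a[i] == curr:
                end = i
            elif i - end > tolerance:
                break

        if index != end:
            result.append(a[index:end+1])
        index = end + 1

    return result


print(sublists_split_on_repeat(['A', 'A', 'B', 'A'], 1))
# [['A', 'B', 'A']]
```

On lists without adjacent duplicates, such as the example list a above, the extra branch never fires and the result is the same as with the original loop body.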

answered Oct 10 '22 21:10 by ikkuh