I'm trying to extract sublists from one larger list of integers in Python 2.7, using start and end patterns. I would like to do it with a function, but I can't find a library, algorithm, or regular expression that solves this problem.
def myFunctionForSublists(data, startSequence, endSequence):
    # ... todo
data = [99, 99, 1, 2, 3, 99, 99, 99, 4, 5, 6, 99, 99, 1, 2, 3, 99, 4, 5, 6, 99]
startSequence = [1,2,3]
endSequence = [4,5,6]
sublists = myFunctionForSublists(data, startSequence, endSequence)
print sublists[0] # [1, 2, 3, 99, 99, 99, 4, 5, 6]
print sublists[1] # [1, 2, 3, 99, 4, 5, 6]
Any ideas on how I can implement this?
Here's a more general solution that doesn't require the input to be sliceable, so you can also use it on other iterables, such as generators.
We keep a deque the size of the start sequence until we come across it. Then we add those values to a list and keep iterating over the sequence. As we do, we keep a second deque the size of the end sequence until we see it, also adding each element to the list we're building. When we come across the end sequence, we yield that list and reset the deque to scan for the next start sequence.
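The sliding-window behaviour this relies on can be seen in isolation. The snippet below is only an illustrative sketch (the names `window` and `seen` are made up for this example): a deque created with `maxlen` drops its oldest element whenever a new one is appended to a full deque, so it always holds the last few values seen.

```python
from collections import deque

# A bounded deque acts as a fixed-size window over the most recent values.
window = deque(maxlen=3)
seen = []  # snapshot of the window after each append (illustration only)

for value in [99, 99, 1, 2, 3]:
    window.append(value)  # once full, the oldest value is discarded
    seen.append(list(window))

print(seen[-1])                    # [1, 2, 3]
print(window == deque([1, 2, 3]))  # True: equality compares contents only
```

Comparing the window against a deque built from the pattern is therefore enough to detect that the pattern has just gone by.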
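from collections import deque

def gen(l, start, stop):
    start_deque = deque(start)
    end_deque = deque(stop)
    # Sliding window holding the most recent len(start) items.
    curr_deque = deque(maxlen=len(start))
    it = iter(l)
    for c in it:
        curr_deque.append(c)
        if curr_deque == start_deque:
            # Start pattern found: begin collecting a candidate sublist.
            potential = list(curr_deque)
            curr_deque = deque(maxlen=len(stop))
            # Continue on the same iterator, now looking for the end pattern.
            for c in it:
                potential.append(c)
                curr_deque.append(c)
                if curr_deque == end_deque:
                    yield potential
                    # Reset the window to scan for the next start pattern.
                    curr_deque = deque(maxlen=len(start))
                    break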
print(list(gen([99, 99, 1, 2, 3, 99, 99, 99, 4, 5, 6, 99, 99, 1, 2, 3, 99, 4, 5, 6, 99], [1,2,3], [4,5,6])))
# [[1, 2, 3, 99, 99, 99, 4, 5, 6], [1, 2, 3, 99, 4, 5, 6]]
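For comparison, if the data is an ordinary list, a slicing-based version is also possible. This is a minimal sketch rather than part of the solution above: the function name `find_sublists` is hypothetical, and it assumes `data`, `start`, and `stop` are all lists and that both patterns are non-empty.

```python
def find_sublists(data, start, stop):
    """Return each slice of data that begins with start and ends at the next stop.

    Assumes data, start and stop are lists and both patterns are non-empty.
    """
    results = []
    i = 0
    while i + len(start) <= len(data):
        if data[i:i + len(start)] != start:
            i += 1
            continue
        # Start pattern found at i; search for the end pattern after it.
        j = i + len(start)
        while j + len(stop) <= len(data) and data[j:j + len(stop)] != stop:
            j += 1
        if j + len(stop) > len(data):
            break  # a start pattern without a matching end pattern
        results.append(data[i:j + len(stop)])
        i = j + len(stop)  # resume scanning after the end pattern
    return results


data = [99, 99, 1, 2, 3, 99, 99, 99, 4, 5, 6, 99, 99, 1, 2, 3, 99, 4, 5, 6, 99]
print(find_sublists(data, [1, 2, 3], [4, 5, 6]))
# [[1, 2, 3, 99, 99, 99, 4, 5, 6], [1, 2, 3, 99, 4, 5, 6]]
```

Unlike the generator version, this needs random access to the whole list, so it will not work on a generator or other one-pass iterable.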