
Given a list of slices, how do I split a sequence by them?

Tags: python


I have long amino-acid strings that I would like to split based on start-stop values in a list. An example is probably the clearest way to explain it:

str = "MSEPAGDVRQNPCGSKAC"
split_points = [[1,3], [7,10], [12,13]]

output >> ['M', '(SEP)', 'AGD', '(VRQN)', 'P', '(CG)', 'SKAC']

The extra parentheses are only there to show which elements were selected by the split_points list. As the example shows, each start-stop pair is zero-based and inclusive, so [1,3] selects "SEP". I don't expect the start-stop points to ever overlap.

I have a bunch of ideas that would work, but they all seem needlessly long, and it seems like there must be a nice Pythonic way of doing this.

latentflip asked Nov 12 '09 19:11




1 Answer

Strange way to split strings you have there:

def splitter( s, points ):
    c = 0
    for x,y in points:
        yield s[c:x]
        yield "(%s)" % s[x:y+1]
        c=y+1
    yield s[c:]

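# str and split_points are the variables defined in the question
print(list(splitter(str, split_points)))
# => ['M', '(SEP)', 'AGD', '(VRQN)', 'P', '(CG)', 'SKAC']

# Adjacent ranges, or a range that touches either end of the string,
# produce empty strings; filter them out if you don't want them.
print([x for x in splitter(str, split_points) if x != ''])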
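If you would rather keep the "selected" information as data instead of baking parentheses into the strings, the same cursor idea can return (segment, selected) pairs. This is only a sketch under the question's assumptions (zero-based, inclusive, sorted, non-overlapping ranges). The names split_by_ranges, pieces and selected are made up for illustration, and seq is used instead of str so the built-in is not shadowed.

```python
def split_by_ranges(seq, ranges):
    """Split seq into (segment, selected) pairs.

    Assumes each [start, stop] pair is zero-based and inclusive, and that
    the pairs are sorted and non-overlapping, as in the question.
    """
    pieces = []
    cursor = 0  # index of the first character not yet emitted
    for start, stop in ranges:
        if start < cursor or stop < start:
            raise ValueError("ranges must be sorted and non-overlapping")
        if start > cursor:
            # unselected gap before this range
            pieces.append((seq[cursor:start], False))
        # the selected range itself (stop is inclusive, hence stop + 1)
        pieces.append((seq[start:stop + 1], True))
        cursor = stop + 1
    if cursor < len(seq):
        # unselected tail after the last range
        pieces.append((seq[cursor:], False))
    return pieces


seq = "MSEPAGDVRQNPCGSKAC"
split_points = [[1, 3], [7, 10], [12, 13]]

pieces = split_by_ranges(seq, split_points)
# Rebuild the parenthesised display form from the question.
formatted = ["(%s)" % text if selected else text for text, selected in pieces]
print(formatted)
# => ['M', '(SEP)', 'AGD', '(VRQN)', 'P', '(CG)', 'SKAC']
```

Empty gaps are skipped up front here, so no filtering step is needed afterwards, and an overlapping or unsorted pair raises ValueError instead of silently giving odd output.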
Jochen Ritzel answered Nov 03 '22 03:11