I have the following rather simple snippet:
def delete_substring_blocks(s, blocks):
'''
s: original input string
blocks: list of indices (start, end) to be deleted
return string `out` where blocks are deleted from s
'''
out = ''
p = 0
for start, end in blocks:
out += s[p:start]
p = end
out += s[p:]
return out
This function takes a string s
and delete all s[start:end]
from s
, where pairs of indices (start, end)
are given in a list blocks
.
Is there a builtin function somewhere that does the same thing?
There is an assumption in my code:
blocks
are sorted by the first index in ascending order (done by list.sort()
in-place)
As for if the blocks can overlap, in my use case I ensure they don't before invoking the function. But for fun we can also assume they do.
My approach transforms blocks
into a set of indices which I call exclude
. After that, loop through the string and exclude those characters whose index is in the exclude
set. I am using set instead of list because it nicely handles the duplicates (in case of overlapping ranges).
exclude
setGiven an unordered, potentially overlapped list of ranges:
blocks = [(5, 7), (2, 4), (6, 10)]
I want to convert this into:
exclude = {2, 3, 5, 6, 7, 8, 9}
How:
exclude = set()
for block in blocks:
exclude.update(range(*block))
Here is my code and a little example at the end. Note that I chose to rename the function since this function is generic enough to work with string, list, tuple, and other iterable objects, not just strings. Also, because the function returns a list, when dealing with string, we need to join the list of characters back together.
def delete_blocks(iterable, blocks):
exclude = set()
for block in blocks:
exclude.update(range(*block))
return [cell for i, cell in enumerate(iterable) if i not in exclude]
# Try it out
test_string = '0123456789abc'
blocks = [(5, 7), (2, 4), (6, 10)]
result = ''.join(delete_blocks(test_string, blocks))
print('Before: {!r}'.format(test_string))
print('Blocks:', blocks)
print('After: {!r}'.format(result))
delete_substring_blocks
To really answer Mai's question, I implemented delete_substring_blocks
using delete_blocks
:
def delete_substring_blocks(s, blocks):
return ''.join(delete_blocks(s, blocks))
Since it is unspecified we must assume that the list of blocks may contain overlaps.
A relatively inefficient expression, but one that will handle overlapping and non-sorted blocks, is:
def delete_substring_blocks(s, blocks):
return ''.join(
[c for i, c in enumerate(s)
if not any(blk for blk in blocks
if i >= blk[0] and i < blk[1])])
Here we simply test each character's position to see if it is within any of the block intervals and accept it if it is not.
Here is an example with an overlapping block:
>>> delete_substring_blocks(
"hello there how are you",
[[0, 3], [7, 9], [7, 10]])
'lo te how are you'
As you appear to find this expression unreadable, here it is broken down a little more:
def delete_substring_blocks(s, blocks):
def check_pos(i):
return not any(1 for start, end in blocks
if i >= start and i < end)
return ''.join([c for i, c in enumerate(s)
if check_pos(i)])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With