I have a string such as manipulate widgets add,1,2,3 (sorry, but I can't change the format).
I want to delete the first X words and any delimiters which precede them.
Let's take 3 as an example, thus deleting manipulate widgets add and leaving ,1,2,3
Or, take manipulate,widgets,add,1,2,3 delete two words (manipulate,widgets) and leave ,add,1,2,3
I can split the string into a list with words = re.split('[' + delimiters + ']', inputString.strip()) but I can't simply delete the first X words
with, say,
for i in range(numWordsToRemove):
    del words[0]
and then return ' '.join(words) because that gives me 1 2 3 (the original commas are lost).
How can I do it and retain the original delimiters of the non-deleted words?
Just to make it interesting, the input string can contain multiple spaces or tabs between words; there is only one comma between words, but it might also have spaces preceding or following it:
manipulate ,widgets add , 1, 2 , 3
Note that words are not guaranteed to be unique, so I can't take the index of the word after those to be deleted and use it to return a positional substring.
[Update] I accepted Kasramvd's solution, but then found that it didn't correctly handle remover('LET FOUR = 2 + 2', 2) or remover('A -1 B text.txt', 2), so now I am offering a bounty.
[Update++] Delimiters are space, tab and comma. Everything else (including equals sign, minus sign, etc.) is part of a word (although I would be happy if answerers would tell me how to add a new delimiter in future, should it become necessary).
You can define a RegEx like this
>>> import re
>>> regEx = re.compile(r'(\s*,?\s*)')
It matches an optional comma surrounded by zero or more whitespace characters. The parentheses create a capturing group, which makes re.split keep the separators in the result.
Now split on the RegEx, then drop the words you don't want together with the separators between them. For example, to skip three words you also skip the two separators between them, so you remove the first five elements of the split result, that is, count + (count - 1) elements. Finally, join what is left.
For example,
>>> def splitter(data, count):
...     return "".join(re.split(regEx, data)[count + (count - 1):])
...
>>> splitter("manipulate,widgets,add,1,2,3", 2)
',add,1,2,3'
>>> splitter("manipulate widgets add,1,2,3", 3)
',1,2,3'
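One caveat: on Python 3.7 and later, re.split also splits on empty matches, and the pattern above can match an empty string, so the slice indices no longer line up there. Below is a minimal sketch of a variant that avoids empty matches. It assumes the delimiters are space, tab and comma as stated in the question, and treats any run of them as one separator. The name drop_words and the delimiters parameter are illustrative, not part of the original answer. Adding a new delimiter would then just mean passing a longer delimiters string.

```python
import re

def drop_words(data, count, delimiters=" \t,"):
    """Drop the first `count` words, keeping the delimiters of the rest."""
    # One or more delimiter characters; this can never match the empty string.
    sep = re.compile("([" + re.escape(delimiters) + "]+)")
    # Strip leading delimiters so the split result starts with a word.
    parts = sep.split(data.lstrip(delimiters))
    # parts alternates word, separator, word, ...; drop `count` words
    # plus the `count - 1` separators between them.
    return "".join(parts[max(2 * count - 1, 0):])

print(repr(drop_words("manipulate widgets add,1,2,3", 3)))
print(repr(drop_words("LET FOUR = 2 + 2", 2)))
```

Because everything that is not a delimiter counts as part of a word, '=' and '-1' are each treated as one word here.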
s1='manipulate widgets add,1,2,3'
# output desired ',1,2,3'
s2='manipulate,widgets,add,1,2,3'
# delete two words (manipulate,widgets) and leave ,add,1,2,3
s3='manipulate ,widgets add , 1, 2 , 3'
# delete 2 or 3 words
import re
# for illustration
print(re.findall(r'\w+', s1))
print(re.findall(r'\w+', s2))
print(re.findall(r'\w+', s3))
print()
def deletewords(s, n):
    # collect the words, drop the first n, and rejoin the rest with commas
    a = re.findall(r'\w+', s)
    return ','.join(a[n:])
# examples for use
print(deletewords(s1, 1))
print(deletewords(s2, 2))
print(deletewords(s3, 3))
output:
['manipulate', 'widgets', 'add', '1', '2', '3']
['manipulate', 'widgets', 'add', '1', '2', '3']
['manipulate', 'widgets', 'add', '1', '2', '3']
widgets,add,1,2,3
add,1,2,3
1,2,3
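Note that this approach rejoins the words with ',' rather than keeping the original delimiters, and \w+ breaks words such as text.txt or -1 into pieces. A rough sketch of an alternative that keeps the remaining text untouched is below. It assumes the delimiters are space, tab and comma as in the question's update; the function name remover is borrowed from the question, while the delimiters parameter is an illustrative addition.

```python
import re

def remover(data, count, delimiters=" \t,"):
    """Remove the first `count` words and the delimiters preceding them."""
    cls = re.escape(delimiters)
    # Any run of delimiters followed by one word (a run of non-delimiters).
    word = re.compile("[%s]*[^%s]+" % (cls, cls))
    pos = 0
    for _ in range(count):
        m = word.match(data, pos)
        if m is None:
            # Fewer than `count` words: stop after the last one found.
            break
        pos = m.end()
    # Everything after the removed words is returned unchanged.
    return data[pos:]

print(repr(remover("LET FOUR = 2 + 2", 2)))
print(repr(remover("A -1 B text.txt", 2)))
```

To add a delimiter later, pass it in the delimiters string, for example remover(s, 2, " \t,;").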