
Split a comma-, space-, or semicolon-separated string using regex

Tags:

python

regex

I use the regex [,;\s]+ to split a comma, space, or semicolon separated string. This works fine if the string doesn't have a comma at the end:

>>> import re
>>> p = re.compile(r'[,;\s]+')
>>> mystring='a,,b,c'
>>> p.split(mystring)
['a', 'b', 'c']

When the string has a comma at the end:

>>> mystring='a,,b,c,'
>>> p.split(mystring)
['a', 'b', 'c', '']

I want the output in this case to be ['a', 'b', 'c'].

Any suggestions on the regex?

ghostcoder asked Nov 30 '22 06:11


2 Answers

Try matching the fields instead of splitting on the delimiters:

import re
mystring = 'a,,b,c,'
re.findall(r'[^,;\s]+', mystring)
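
This avoids the trailing empty string because `re.findall` collects the runs of non-delimiter characters rather than cutting at each delimiter, so a delimiter at the start or end of the string never yields an empty field. A minimal sketch of that idea follows; the helper name `split_fields` is made up for illustration and is not part of the original answer.

```python
import re

# One or more characters that are NOT a comma, semicolon, or whitespace.
FIELD = re.compile(r'[^,;\s]+')

def split_fields(text):
    """Return the non-empty fields of text (hypothetical helper name)."""
    return FIELD.findall(text)

print(split_fields('a,,b,c,'))  # ['a', 'b', 'c']
print(split_fields(';a b,'))    # ['a', 'b']
```

If the original `split`-based pattern has to stay, filtering the result with `[s for s in p.split(mystring) if s]` should give the same list.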
Qtax answered Feb 26 '23 01:02


Here's something very low tech that should still work:

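mystring = 'a,,b,c'
# Turn every delimiter into a space...
for delim in ',;':
    mystring = mystring.replace(delim, ' ')
# ...then split on runs of whitespace; split() with no arguments drops empty strings.
results = mystring.split()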
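
The same idea can be packaged as a function and checked against the trailing-comma input from the question. This is only a sketch: it swaps the `replace` loop for `str.translate`, which maps both delimiters to spaces in a single pass, and the names `DELIMS_TO_SPACE` and `split_fields_plain` are hypothetical.

```python
# Translation table mapping ',' and ';' (the delimiters from the question) to spaces.
DELIMS_TO_SPACE = str.maketrans(',;', '  ')

def split_fields_plain(text):
    """Hypothetical helper: normalize delimiters to spaces, then split on whitespace."""
    # split() with no arguments splits on runs of whitespace and
    # discards empty strings at both ends.
    return text.translate(DELIMS_TO_SPACE).split()

print(split_fields_plain('a,,b,c,'))  # ['a', 'b', 'c']
```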

PS: While regexes are very useful, I would strongly suggest thinking twice about whether they are the right tool for the job here. I'm not sure what the exact runtime of a compiled regex is on this input (a backtracking engine such as Python's re can be worse than linear for some patterns), but it cannot be faster than O(n), which is the runtime of each str.replace call. So unless you need a regex for some other reason, you should be set with this solution.
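
Rather than guessing at the relative cost, the two approaches can be timed directly. The sketch below uses the standard `timeit` module on a synthetic input invented for this comparison; the function names are illustrative, and the numbers will vary by machine and Python version, so no expected timings are shown.

```python
import re
import timeit

PATTERN = re.compile(r'[^,;\s]+')

def with_regex(text):
    # Regex approach: collect runs of non-delimiter characters.
    return PATTERN.findall(text)

def with_replace(text):
    # Plain-string approach: turn delimiters into spaces, then split.
    for delim in ',;':
        text = text.replace(delim, ' ')
    return text.split()

# Synthetic input, made up for this comparison.
sample = 'alpha,,beta; gamma,delta,' * 1000

# Both approaches should agree before their timings are compared.
assert with_regex(sample) == with_replace(sample)

for func in (with_regex, with_replace):
    seconds = timeit.timeit(lambda: func(sample), number=100)
    print(f'{func.__name__}: {seconds:.4f} s for 100 runs')
```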

inspectorG4dget answered Feb 26 '23 01:02