I want to split strings on a comma delimiter, but only where the comma is preceded by a certain regex. My strings are in the format "(bunch of stuff that might have commas) FOO_REGEX, (other stuff that might have commas) FOO_REGEX, ..." and I want to split on commas only if they're preceded by FOO_REGEX, giving: ["(bunch of stuff that might have commas) FOO_REGEX", "(other stuff that might have commas) FOO_REGEX", etc.].
As a concrete example, consider splitting the following string:
"hi, hello! $$asdf, I am foo, bar $$jkl, cool"
into this list of three strings:
["hi, hello! $$asdf",
"I am foo, bar $$jkl",
"cool"]
Is there an easy way to do this in Python?
You could use re.findall instead of re.split.
>>> import re
>>> s = "hi, hello! $$asdf, I am foo, bar $$jkl, cool"
>>> [j for i in re.findall(r'(.*?\$\$[^,]*),\s*|(.+)', s) for j in i if j]
['hi, hello! $$asdf', 'I am foo, bar $$jkl', 'cool']
Or use the external regex module, which supports variable-length lookbehind; the built-in re module does not support variable-length look-behind assertions.
>>> import regex
>>> s = "hi, hello! $$asdf, I am foo, bar $$jkl, cool"
>>> regex.split(r'(?<=\$\$[^,]*),\s*', s)
['hi, hello! $$asdf', 'I am foo, bar $$jkl', 'cool']
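If installing the third-party regex module is not an option, the same split can probably be done with the standard re module by matching the marker together with the comma and slicing the string yourself. This is only a sketch: split_after is a hypothetical helper name, not part of re, and it assumes the pattern you pass in contains no capturing groups of its own.

```python
import re

def split_after(pattern, text):
    """Split text on commas that directly follow a match of pattern.

    Hypothetical helper for illustration; assumes `pattern` has no
    capturing groups, so the comma is always group 1.
    """
    # Match the marker (non-capturing), then capture the comma and any
    # whitespace after it so we know exactly where the delimiter starts.
    splitter = re.compile(r'(?:%s)(,\s*)' % pattern)
    parts = []
    start = 0
    for m in splitter.finditer(text):
        # Keep everything up to (but not including) the comma.
        parts.append(text[start:m.start(1)])
        # Resume after the comma and the whitespace that follows it.
        start = m.end()
    # Whatever remains after the last delimiter is the final piece.
    parts.append(text[start:])
    return parts

s = "hi, hello! $$asdf, I am foo, bar $$jkl, cool"
print(split_after(r'\$\$[^,]*', s))
```

For the example string this should give the same three pieces as the regex.split call above. Like re.split, it returns an empty final string if the input ends with a matching comma.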