As the documentation says:
This is called a negative lookbehind assertion. Similar to positive lookbehind assertions, the contained pattern must only match strings of some fixed length.
So this works. The intention is to match any , outside {} but not inside {}:
In [188]:
re.compile(r"(?<!\{),.").findall('a1,a2,a3,a4,{,a6}')
Out[188]:
[',a', ',a', ',a', ',{']
This also works, on a slightly different query:
In [189]:
re.compile(r"(?<!\{a5),.").findall('a1,a2,a3,a4,{a5,a6}')
# or this: re.compile(r"(?<!\{..),.").findall('a1,a2,a3,a4,{a5,a6}')
Out[189]:
[',a', ',a', ',a', ',{']
But if the query is 'a1,a2,a3,a4,{_some_length_not_known_in_advance,a6}', then according to the documentation the following won't work as intended:
In [190]:
re.compile(r"(?<![\{.*]),.").findall('a1,a2,a3,a4,{a5,a6}')
Out[190]:
[',a', ',a', ',a', ',{', ',a']
Any alternative to achieve this? Is negative lookbehind the wrong approach?
Is there a reason lookbehind was designed this way (matching only strings of some fixed length) in the first place?
With a negative lookbehind, the regex engine finds a candidate position for the main pattern, then looks backwards and tries to match the lookbehind pattern against the text immediately before that position. If that backward match succeeds, the overall match fails at that position; otherwise it succeeds.
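A small Python sketch of that behaviour, using a made-up sample string. It also shows that Python's re module rejects a lookbehind whose length is not fixed, which is the limitation the question runs into:

```python
import re

# The comma preceded by "{" is rejected; the other comma is kept.
matches = re.findall(r'(?<!\{),.', 'a,b{,c')
print(matches)  # [',b']

# A lookbehind of unknown length is refused when the pattern is compiled.
try:
    re.compile(r'(?<!\{.*),')
    compiled = True
except re.error as exc:
    compiled = False
    print('re.error:', exc)
```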
The positive lookbehind (?<=...) and negative lookbehind (?<!...) zero-width assertions in JavaScript regular expressions can be used to ensure that a pattern is, or is not, preceded by another pattern.
Lookbehind matches a phrase that is preceded by user-specified text. A positive lookbehind is written as (?<=a)something and can be combined with any other regex construct. That pattern matches any "something" that is immediately preceded by an "a".
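The (?<=a)something pattern works the same way in Python's re module. A tiny sketch, with a sample string invented for illustration:

```python
import re

sample = 'asomething bsomething'

# Only the "something" that directly follows an "a" is matched;
# the "a" itself is not part of the match (the assertion is zero-width).
hits = [(m.start(), m.group()) for m in re.finditer(r'(?<=a)something', sample)]
print(hits)  # [(1, 'something')]
```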
Negative lookbehinds seem to be the only answer, but JavaScript doesn't have one. Consider posting the regex as it would look with a negative lookbehind; that may make it easier to respond. @WiktorStribiżew: Lookbehinds were added in the 2018 spec. Chrome supports them, but Firefox still hasn't implemented the spec.
Any alternative to achieve this?
Yes. There is a brilliantly simple technique, and this situation is very similar to "regex-match a pattern unless..."
Here's your simple regex:
{[^}]*}|(,)
The left side of the alternation | matches complete { brackets } tags. We will ignore these matches. The right side matches and captures commas to Group 1, and we know they are the right commas because they were not matched by the expression on the left.
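A minimal Python sketch of this alternation trick, assuming the braces are never nested. The helper name split_outside_braces is made up for illustration, and the braces are escaped in the pattern, which is equivalent to the regex above:

```python
import re

# Left branch swallows a whole {...} group; right branch captures a bare comma.
PATTERN = re.compile(r'\{[^}]*\}|(,)')

def split_outside_braces(s):
    """Split s on commas that are not inside a {...} group."""
    parts, last = [], 0
    for m in PATTERN.finditer(s):
        # Group 1 is set only when the right-hand branch matched.
        if m.group(1) is not None:
            parts.append(s[last:m.start()])
            last = m.end()
    parts.append(s[last:])
    return parts

text = 'a1,a2,a3,a4,{_some_length_not_known_in_advance,a6}'
positions = [m.start() for m in PATTERN.finditer(text) if m.group(1)]
print(positions)                   # [2, 5, 8, 11]
print(split_outside_braces(text))
```

The last line prints the four plain items followed by the whole braced group as a single item.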
Here is a demo that performs several tasks, so you can pick and choose (see the output at the bottom of the demo):
Reference: How to match (or replace) a pattern except in situations s1, s2, s3...
Instead of using Negative Lookbehind, you can use Negative Lookahead with balanced braces.
,(?![^{]*\})
For example:
>>> re.findall(r',..(?![^{]*\})', 'a1,a2,a3,a4,{_some_unknown_length,a5,a6,a7}')
[',a2', ',a3', ',a4']
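To make the lookahead concrete, here is a small sketch that splits on the commas it selects. It assumes the braces are balanced and not nested, and reuses the sample string from the example above:

```python
import re

text = 'a1,a2,a3,a4,{_some_unknown_length,a5,a6,a7}'

# A comma qualifies only if no "}" can be reached ahead without first
# passing a "{", i.e. the comma is not inside an open brace group.
outside_comma = re.compile(r',(?![^{]*\})')

parts = outside_comma.split(text)
print(parts)
# ['a1', 'a2', 'a3', 'a4', '{_some_unknown_length,a5,a6,a7}']
```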