I have a string like this:
"quick" "brown" fox jumps "over" "the" lazy dog
I need a regex to detect words not enclosed in double quotes. After some random tries I found this:
("([^"]+)")
This detects a string enclosed in double quotes, but I want the opposite. I can't come up with it, even after trying to invert the regex above. I am quite weak at regex. Please help.
Use lookahead/lookbehind assertions:
(?<![\S"])([^"\s]+)(?![\S"])
Example:
>>> import re
>>> a='"quick" "brown" fox jumps "over" "the" lazy dog'
>>> print(re.findall(r'(?<![\S"])([^"\s]+)(?![\S"])', a))
['fox', 'jumps', 'lazy', 'dog']
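For reuse, the same pattern can be wrapped in a small helper. This is only a sketch for Python 3: the names UNQUOTED_WORD and unquoted_words are made up for illustration and do not come from any library.

```python
import re

# Same pattern as in the answer, compiled once.
# The raw string (r'...') keeps the backslashes in \S and \s intact.
UNQUOTED_WORD = re.compile(r'(?<![\S"])([^"\s]+)(?![\S"])')

def unquoted_words(text):
    """Return the whitespace-separated words of text that contain no double quote."""
    return UNQUOTED_WORD.findall(text)

print(unquoted_words('"quick" "brown" fox jumps "over" "the" lazy dog'))
# ['fox', 'jumps', 'lazy', 'dog']
```

Note that the pattern works word by word. In a quoted phrase with several words, a middle word that does not touch a quote is still reported: unquoted_words('"the very lazy" dog') gives ['very', 'dog'].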
The key here is lookahead/lookbehind assertions. With them you can say: I want this symbol before (or after) the expression, or I want it to be absent, but either way it should not be part of the match itself. For example:
(?<![\S"])abc
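That is a negative lookbehind. It means you want abc, but without [\S"] before it: there must be no non-space character before it (so abc is at the beginning of a word) and no ". Since " is itself a non-space character, \S alone already covers it; listing it just makes the intent explicit.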
That is the same but in the other direction:
abc(?![\S"])
That is a negative lookahead. It means you want abc, but without [\S"] after it.
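To see what each assertion does on its own, here is a small Python 3 sketch that applies the two abc patterns separately and then together. The sample string and the variable names are invented for illustration; the numbers are the start positions of the matches.

```python
import re

text = 'abc xabc "abc abcx abc" abc'

# Negative lookbehind only: abc must not be preceded by a non-space character.
starts_ok = [m.start() for m in re.finditer(r'(?<![\S"])abc', text)]

# Negative lookahead only: abc must not be followed by a non-space character.
ends_ok = [m.start() for m in re.finditer(r'abc(?![\S"])', text)]

# Both together: abc must stand alone as a whole word with no quote attached.
both_ok = [m.start() for m in re.finditer(r'(?<![\S"])abc(?![\S"])', text)]

print(starts_ok)  # [0, 14, 19, 24]
print(ends_ok)    # [0, 5, 10, 24]
print(both_ok)    # [0, 24]
```

Only the first and last abc pass both checks; the others are rejected because of an x or a " on one side.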
There are four different assertions of this type in general:
(?=pattern) is a positive look-ahead assertion
(?!pattern) is a negative look-ahead assertion
(?<=pattern) is a positive look-behind assertion
(?<!pattern) is a negative look-behind assertion
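The four forms can be tried side by side. The following is a minimal Python 3 sketch; the sample text and variable names are made up for illustration. One thing to keep in mind: Python's re module only accepts look-behind patterns that match a fixed number of characters.

```python
import re

text = 'price: $100, cost: 200 USD, id: #300'

# Positive look-ahead: numbers that are followed by " USD".
followed_by_usd = re.findall(r'\d+(?= USD)', text)

# Negative look-ahead: whole numbers that are not followed by " USD".
# The \b anchors stop the engine from backtracking to a partial number like "20".
not_followed_by_usd = re.findall(r'\b\d+\b(?! USD)', text)

# Positive look-behind: numbers that are preceded by "$".
after_dollar = re.findall(r'(?<=\$)\d+', text)

# Negative look-behind: whole numbers that are not preceded by "$".
not_after_dollar = re.findall(r'(?<!\$)\b\d+', text)

print(followed_by_usd)      # ['200']
print(not_followed_by_usd)  # ['100', '300']
print(after_dollar)         # ['100']
print(not_after_dollar)     # ['200', '300']
```

In every case the " USD" or "$" is only checked, never included in the result, which is exactly what makes these constructs assertions rather than ordinary parts of the pattern.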