I'd appreciate a bit of help reading/interpreting this regular expression (Python syntax):
(?:^|[b_./-])[Tt]est
(It is the default RE used by nosetests as a filter when looking for test files.) It is described here, in the 'Extended usage' section.
EDIT: If you came here due to looking into nose, you may want to look into pytest as an alternative.
My understanding so far: the (?: ... ) construct (open paren, question mark, colon, stuff, close paren) is an 'extension' that excludes the enclosed stuff from the string in the match result. From my point of view, as someone just trying to understand whether a given filename meets or fails the expression, that means I can ignore it(?)
The last part '[Tt]est' means Test or test.
The meaning of the rest is hazy. The caret means 'match start of string', the vertical bar means OR, and the chars in brackets (b, underscore, period, slash, minus) are the alternative match. In other words match start-of-string or one of the 5 specified characters, followed by Test or test? Which would mean the strings 'bTest' and '/Test' would match (and they apparently do not).
Thanks for any help in improving my interpretation of the pattern!
(?:...) is a non-capturing group; it is the same thing as (...) but does not produce a captured group value. Here it limits the scope of the | alternation. The group matches either the start of the string, or one of the characters b, _, ., / or -.
So the expression produces matches for input text containing Test or test at the start of the string, or directly preceded by b, an underscore, a dot, a slash or a dash.
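To make the effect of the non-capturing group concrete, here is a small sketch comparing it with an ordinary capturing group. The sample string 'my_test.py' is made up for illustration. In both cases the prefix character is part of the overall match; the only difference is whether it is also stored as a group.

```python
import re

non_capturing = re.compile(r'(?:^|[b_./-])[Tt]est')
capturing = re.compile(r'(^|[b_./-])[Tt]est')

# Hypothetical file name, used only for illustration.
text = 'my_test.py'

m1 = non_capturing.search(text)
m2 = capturing.search(text)

# Both patterns match the same text, including the leading "_".
print(m1.group())   # _test
print(m2.group())   # _test

# Only the capturing version records the prefix as a separate group.
print(m1.groups())  # ()
print(m2.groups())  # ('_',)
```

So the group cannot be ignored when deciding whether a filename matches: it still has to match something, it just isn't reported separately.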
'bTest' and '/Test' do match:
>>> import re
>>> pattern = re.compile(r'(?:^|[b_./-])[Tt]est')
>>> pattern.search('Test').group()
'Test'
>>> pattern.search('Hello bTest').group()
'bTest'
>>> pattern.search('Hello /Test').group()
'/Test'
The b is surprising, so I had a look at the source code. The documentation is missing a backslash: it is not b that is matched, but \b:
r'(?:^|[\b_\.%s-])[Tt]est' % os.sep)
\b is normally a word boundary, so the intent was presumably to allow anything that is not a word character before test or Test. This is most likely a bug: \b cannot act as a word boundary inside a character class, so the \b there doesn't work. Instead, it is interpreted as a backspace character.
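A quick way to check this behaviour yourself is sketched below. The last pattern is the one from the nose source, with os.sep assumed to be "/" (as on POSIX systems); the test strings are made up for illustration.

```python
import re

# Inside a character class, \b is the backspace character (\x08).
in_class = re.compile(r'[\b]')
print(bool(in_class.search('\x08')))      # True
print(bool(in_class.search('a test')))    # False

# Outside a character class, \b is a zero-width word boundary.
boundary = re.compile(r'\b[Tt]est')
print(bool(boundary.search('a test')))    # True
print(bool(boundary.search('contest')))   # False

# The pattern from the nose source, assuming os.sep == '/'.
nose_pattern = re.compile(r'(?:^|[\b_\./-])[Tt]est')
print(bool(nose_pattern.search('Hello bTest')))     # False
print(bool(nose_pattern.search('Hello\x08Test')))   # True
```

With the backslash in place, 'bTest' no longer matches, which would explain why the documented pattern and the observed behaviour disagree.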
This issue was reported to the project 3 1/2 years ago.
The non-capturing group (?: ... ) does not change what the regular expression matches; it only groups the alternatives without capturing them.
(?:^|[b_./-])[Tt]est

...so at the beginning of the string it would match:
test
Test
or one of the below options anywhere in the text:
bTest
btest
_Test
_test
.Test
.test
/Test
/test
-Test
-test
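As a rough sketch, the list above can be checked against a few file names with re.search. The file names here are hypothetical, and the pattern is the one as documented (with a literal b), which may differ from what nose actually applies.

```python
import re

pattern = re.compile(r'(?:^|[b_./-])[Tt]est')

# Hypothetical file names, chosen only for illustration.
candidates = [
    'test_models.py',   # "test" at the start of the string
    'models_test.py',   # preceded by "_"
    'pkg/Test.py',      # preceded by "/"
    'contest.py',       # preceded by "n": no match
    'latest.py',        # preceded by "a": no match
    'dbtest.py',        # preceded by "b": matches the documented pattern
]

matching = [name for name in candidates if pattern.search(name)]
print(matching)
# ['test_models.py', 'models_test.py', 'pkg/Test.py', 'dbtest.py']
```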