I have the following string
'abc[123]defgh ijk[456]lm no[78] pq'
I would like to extract all parts that lie either between the beginning of the string and [ or between whitespace and [. For the given string, these are the parts 'abc', 'ijk', and 'no'.
I have the following expression
exp = re.compile(r'\s(.*?)\[')
But I cannot figure out how to add the beginning of the string as an optional expression. How do I have to write the expression to cover both cases?
In Python's re module, the caret ^ matches the beginning of the string. This is useful, for example, when you want to ensure that a pattern appears at the start of a string.
\s | Matches whitespace characters, which include \t, \n, \r, and the space character.
\S | Matches non-whitespace characters.
The metacharacter ^ matches the beginning of a string. It is a zero-width assertion, so it matches the position before the first character rather than the character itself. For example, the expression ^\d matches a string (or line) starting with a digit, and ^[a-z] matches a string (or line) starting with a lowercase letter.
[] denotes a character class; () denotes a capturing group. [a-z0-9] matches one character in the range a-z or 0-9. (a-z0-9) matches and captures the literal text a-z0-9.
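The difference between the two bracket types can be checked directly. This is a minimal sketch; the variable names are only illustrative.

```python
import re

# Character class: matches exactly one character from the set a-z or 0-9.
char_class = re.compile(r'[a-z0-9]')
# Capturing group: matches the literal text "a-z0-9" and captures it.
group = re.compile(r'(a-z0-9)')

class_hit = char_class.fullmatch('x')    # a single lowercase letter matches
class_miss = char_class.fullmatch('xy')  # two characters do not
group_hit = group.fullmatch('a-z0-9')    # only the literal text matches
group_miss = group.fullmatch('x')

print(class_hit is not None, class_miss is None)
print(group_hit.group(1), group_miss is None)
```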
Try this pattern:
(?:^|\s)(.*?)\[
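The start anchor (^) matches the beginning of the string (or of each line in MULTILINE mode). Putting it in a non-capturing group together with \s, as (?:^|\s), lets the match begin at either position without adding a second capture group.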
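As a quick check, here is a sketch that runs the expression from the question and the suggested pattern against the sample string. The variable names are illustrative, and the MULTILINE part uses a simpler pattern, ^(\w+)\[, chosen only to make the anchor's behavior visible.

```python
import re

text = 'abc[123]defgh ijk[456]lm no[78] pq'

# Expression from the question: requires a whitespace character before the part.
original = re.compile(r'\s(.*?)\[')
# Suggested pattern: accepts either the start of the string or whitespace.
fixed = re.compile(r'(?:^|\s)(.*?)\[')

original_parts = original.findall(text)
fixed_parts = fixed.findall(text)
print(original_parts)  # ['ijk', 'no']
print(fixed_parts)     # ['abc', 'ijk', 'no']

# Without MULTILINE, ^ matches only at the very start of the string;
# with MULTILINE, it also matches right after each newline.
lines = 'abc[1]\ndef[2]'
first_only = re.findall(r'^(\w+)\[', lines)
each_line = re.findall(r'^(\w+)\[', lines, flags=re.MULTILINE)
print(first_only)  # ['abc']
print(each_line)   # ['abc', 'def']
```

Because findall returns only the contents of the single capture group, the leading whitespace consumed by (?:^|\s) does not appear in the results.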
Another option: after matching the start of the string or a whitespace character, capture everything that is not a [ and use a lookahead to ensure it is followed by a [:
(?:^|\s)([^\[]+)(?=\[)
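The lookahead version can be tried the same way. One caveat: [^\[]+ also accepts whitespace, so on an input such as 'foo bar[1]' the capture starts at the beginning of the string and includes the space. The stricter pattern below, which additionally excludes whitespace from the captured part, is my own assumption about what is wanted and is not part of the answer.

```python
import re

text = 'abc[123]defgh ijk[456]lm no[78] pq'

# Pattern from the answer: capture non-"[" characters, then look ahead for "[".
lookahead = re.compile(r'(?:^|\s)([^\[]+)(?=\[)')
parts = lookahead.findall(text)
print(parts)  # ['abc', 'ijk', 'no']

# Hypothetical stricter variant: the captured part may not contain whitespace.
strict = re.compile(r'(?:^|\s)([^\s\[]+)(?=\[)')

loose_result = lookahead.findall('foo bar[1]')
strict_result = strict.findall('foo bar[1]')
print(loose_result)   # ['foo bar']
print(strict_result)  # ['bar']
```

On the sample string from the question both patterns return the same three parts; they differ only when a word without a following [ precedes the part to extract.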