Suppose I have the following Python string
str = """
....
Dummyline
Start of matching
+----------+----------------------------+
+   test   +           1234             +
+   test2  +           5678             +
+----------+----------------------------+
Finish above. Do not match this
+----------+----------------------------+
+  dummy1  +       00000000000          +
+  dummy2  +       12345678910          +
+----------+----------------------------+
"""
and I want to match everything in the first table. I could use a regex that starts matching at "Start" and matches everything until it finds a double newline (\n\n).
I found some tips on how to do this in another Stack Overflow post (How to match "anything up until this sequence of characters" in a regular expression?), but they don't seem to work for the double newline case.
I thought of the following code
import re
pattern = re.compile(r"Start[^\n\n]")
matches = pattern.finditer(str)
where, basically, [^x] means match everything until character x is found. But this works only for single characters, not for strings ("\n\n" in this case).
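A quick check illustrates the limitation. This is only a sketch: the sample string and the variable names are made up for illustration and are not the string from the question.

```python
import re

# Hypothetical sample: two lines, a blank line, then more text.
sample = "Start of matching\nnext line\n\nafter the blank line"

# [^\n\n] is a character class: listing \n twice is the same as [^\n],
# and without a quantifier it consumes exactly one character.
one_char = re.search(r"Start[^\n\n]", sample).group(0)

# With a * quantifier it runs to the end of the current line only,
# stopping at the first single newline rather than at "\n\n".
one_line = re.search(r"Start[^\n]*", sample).group(0)

print(repr(one_char))  # 'Start '
print(repr(one_line))  # 'Start of matching'
```

So a negated character class can stop at a single newline, but it cannot express "stop at two newlines in a row".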
Does anybody have an idea how to do this?
You can match Start until the end of the line, and then match every following line whose leading newline is not immediately followed by another newline, using a negative lookahead (?!\r?\n):
^Start .*(?:\r?\n(?!\r?\n).*)*
Explanation
^Start .*     Match Start from the start of the string ^ and 0+ times any char except a newline
(?:           Non capture group
\r?\n         Match a newline
(?!\r?\n)     Negative lookahead, assert what is directly to the right is not a newline
.*            Match 0+ times any character except a newline
)*            Close the non capturing group and repeat 0+ times to get all the lines
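In Python the pattern could be tried roughly as follows. This is a sketch under two assumptions: the sample text has a blank line (a double newline) right after the first table, as the question describes, and re.MULTILINE is passed so that ^ can match at the start of any line, since the sample string begins with a newline. The names text, pattern, and first_block are chosen here for illustration.

```python
import re

# Assumed sample: same shape as the question's string, with a blank line
# (double newline) after the first table.
text = """
....
Dummyline
Start of matching
+----------+----------------------------+
+   test   +           1234             +
+   test2  +           5678             +
+----------+----------------------------+

Finish above. Do not match this
+----------+----------------------------+
+  dummy1  +       00000000000          +
+----------+----------------------------+
"""

# re.MULTILINE lets ^ match at the start of any line, not only at the
# very start of the string.
pattern = re.compile(r"^Start .*(?:\r?\n(?!\r?\n).*)*", re.MULTILINE)

match = pattern.search(text)
# The match stops before the newline that is followed by another newline.
first_block = match.group(0) if match else None
print(first_block)
```

The variable is named text rather than str so that Python's built-in str type is not shadowed.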