How do I extract all the characters (including newline characters) until the first occurrence of a given sequence of characters? For example, with the following input:
input text:
"shantaram is an amazing novel.
It is one of the best novels i have read.
the novel is written by gregory david roberts.
He is an australian"
And the sequence "the", I want to extract the text from "shantaram" to the first occurrence of "the", which is in the second line.
The output must be:
shantaram is an amazing novel.
It is one of the
I have been trying all morning. I can write the expression to extract all characters until it encounters a specific character but here if I use an expression like:
re.search("shantaram[\s\S]*the", string)
It doesn't match across newline.
Use this regex,
re.search("shantaram[\s\S]*?the", string)
instead of
re.search("shantaram[\s\S]*the", string)
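The only difference is the '?'. Adding '?' after a quantifier (e.g. *?, +?) makes it lazy (non-greedy): it matches as few characters as possible, so the match stops at the first "the" instead of running on to the last one.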
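To see the difference, here is a minimal sketch that runs both patterns on the text from the question. The variable names (text, greedy, lazy, dotall) are only for illustration, and the patterns are written as raw strings (r"...") so that Python does not try to interpret \s as a string escape. The last pattern is an alternative I am adding, not part of the answer above: with the re.DOTALL flag, '.' also matches newlines, so .*? behaves like [\s\S]*?.

```python
import re

# The sample text from the question, with explicit newlines.
text = (
    "shantaram is an amazing novel.\n"
    "It is one of the best novels i have read.\n"
    "the novel is written by gregory david roberts.\n"
    "He is an australian"
)

# Greedy: [\s\S]* takes as much as it can, so the match ends at the LAST "the".
greedy = re.search(r"shantaram[\s\S]*the", text)

# Lazy: [\s\S]*? takes as little as it can, so the match ends at the FIRST "the".
lazy = re.search(r"shantaram[\s\S]*?the", text)

# Alternative (an assumption, not from the answer): let '.' match newlines.
dotall = re.search(r"shantaram.*?the", text, flags=re.DOTALL)

print(repr(greedy.group(0)))
print(repr(lazy.group(0)))
print(repr(dotall.group(0)))
```

Note that both patterns do match across newlines; the greedy one simply runs past the first "the" to the one at the start of the third line. Also, "the" here matches anywhere, including inside longer words such as "other"; if only the whole word is wanted, \bthe\b is one option.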