
Regex - Pattern until double \n

Tags: regex

I have not been able to find anything online about how to make a pattern end at a double \n. My particular case is the following. I have this string:

"1 Matt\n00:00:00,100 --> 00:00:01,500\nThis is said \nby Matt.\n\n2 Lucas\n00:00:01,700 --> 00:00:02,300\nWhile this is said by Lucas"

And I would like to extract only the texts between digit\n and \n\n. So, in my case, I'd like to have

This is said \nby Matt.
While this is said by Lucas

Although I am not very skilled with RegEx, I tried many combinations such as (?<=\d\n).*?(?=\n\n), (?<=\d\n).\n\n and (?<=\d\n).*?(?=\r\n\r\n), but without any luck.

I have tried those, as well as others, with R's stringr library and also with Python's re. The issue first came up in this answer: https://stackoverflow.com/a/72547966/19284124

asked Feb 07 '26 07:02 by aooo


1 Answer

You can make the . match across lines with the (?s) inline modifier and extend the double newline pattern to alternatively match the end of string:

(?s)(?<=\d\n).*?(?=\n\n|\Z)

See the regex demo.

Details:

  • (?s) - an inline flag that allows . to match line break chars
  • (?<=\d\n) - a positive lookbehind that matches a location that is immediately preceded with a digit and a newline
  • .*? - any zero or more chars, as few as possible
  • (?=\n\n|\Z) - a positive lookahead that matches a location that is immediately followed with two newline chars or end of string.
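
To make the breakdown above concrete, here is a minimal sketch using Python's re module on the string from the question. The names subtitles, PATTERN and extract_texts are made up for illustration, and the two extra patterns are only there to show what each piece contributes; treat it as a quick check rather than a full subtitle parser.

```python
import re

# Sample string from the question (two subtitle blocks separated by a blank line).
subtitles = (
    "1 Matt\n00:00:00,100 --> 00:00:01,500\nThis is said \nby Matt.\n\n"
    "2 Lucas\n00:00:01,700 --> 00:00:02,300\nWhile this is said by Lucas"
)

# Pattern from the answer: (?s) lets . cross newlines, \Z covers the last block.
PATTERN = re.compile(r"(?s)(?<=\d\n).*?(?=\n\n|\Z)")


def extract_texts(text):
    """Return every text that follows 'digit + newline' up to a blank line or the end."""
    return PATTERN.findall(text)


print(extract_texts(subtitles))
# ['This is said \nby Matt.', 'While this is said by Lucas']

# Without (?s), . stops at the newline inside the first block, so that block is skipped.
print(re.findall(r"(?<=\d\n).*?(?=\n\n|\Z)", subtitles))
# ['While this is said by Lucas']

# Without the \Z alternative, the last block has no trailing blank line and is lost.
print(re.findall(r"(?s)(?<=\d\n).*?(?=\n\n)", subtitles))
# ['This is said \nby Matt.']
```

The lookbehind only fires after the timestamp lines because they are the only lines that end in a digit; a speaker name that ends in a digit would also trigger it.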
answered Feb 09 '26 11:02 by Wiktor Stribiżew


