I have a regex formula that I'm using to find specific patterns in my data. Specifically, it starts by looking for characters between "{}" brackets, and looks for "p. " and grabs the number after. I noticed that, in some instances, if there's not a "p. " value shortly after the brackets, it will continue to go through the next brackets and grab the number after.
For example, here is my sample data:
{Hello}, [1234] (Test). This is sample data used to answer a question {Hello2} [Ch.8 p. 87 gives more information about...
Here is my code:
\{(.*?)\}(.*?)p\. ([0-9]+)
I want it to return this only:
{Hello2} [Ch.8 p. 87
But it returns this:
{Hello}, [1234] (Test). This is sample data used to answer a question {Hello2} [Ch.8 p. 87
Is there a way to exclude strings that contain, let's say, "{"?
Your pattern first matches from { to }, and then .*? matches lazily, extending one character at a time until it can be followed by a p, a dot, a space and 1+ digits.
It can reach across the next pair of brackets because the dot can also match { and }.
You could use the negated character class [^{}] so that neither { nor } can be matched inside the brackets or between the closing } and the p:
\{[^{}]*\}[^{}]+p\. [0-9]+
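To see the difference, here is a minimal sketch that runs both patterns on the sample data from the question. It assumes Python's re module (the question does not say which regex engine is in use), and the variable names are only illustrative.

```python
import re

# Sample data from the question.
text = ("{Hello}, [1234] (Test). This is sample data used to answer "
        "a question {Hello2} [Ch.8 p. 87 gives more information about...")

# Original pattern: the dot can match { and }, so the lazy .*? keeps
# extending past {Hello2} until it finally reaches "p. 87".
original = re.compile(r"\{(.*?)\}(.*?)p\. ([0-9]+)")

# Suggested pattern: [^{}] cannot match a brace, so a match cannot
# cross into the next pair of brackets.
fixed = re.compile(r"\{[^{}]*\}[^{}]+p\. [0-9]+")

original_match = original.search(text).group(0)
fixed_match = fixed.search(text).group(0)

print(original_match)  # starts at {Hello} and runs through "p. 87"
print(fixed_match)     # {Hello2} [Ch.8 p. 87
```

The suggested pattern has no capture groups. If you still need the number on its own, you could wrap the last part in parentheses, i.e. ([0-9]+), and read that group.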