
Regex to exclude specific characters

Tags:

regex

I have a regex that I'm using to find specific patterns in my data. It looks for characters between "{}" brackets, then looks for "p. " and grabs the number after it. I noticed that when there is no "p. " value shortly after a pair of brackets, the match runs on through the next pair of brackets and grabs the number after those instead.

For example, here is my sample data:

{Hello}, [1234] (Test). This is sample data used to answer a question {Hello2} [Ch.8 p. 87 gives more information about...

Here is my code:

\{(.*?)\}(.*?)p\. ([0-9]+)

I want it to return this only:

{Hello2} [Ch.8 p. 87

But it returns this:

{Hello}, [1234] (Test). This is sample data used to answer a question {Hello2} [Ch.8 p. 87

Is there a way to exclude strings that contain, let's say, "{"?

malibu2 asked Jun 12 '19 20:06



1 Answer

Your pattern first matches from { to }. The lazy .*? then extends one character at a time until what follows is a p, a dot, a space and 1+ digits.

It can run past the next pair of brackets because the dot also matches { and }.
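To see this, here is a minimal Python sketch that runs the pattern from the question on the sample data and inspects the capture groups. Python and the variable names are my own choice for illustration; the question does not say which regex engine is in use.

```python
import re

# Sample data and pattern copied from the question.
sample = ("{Hello}, [1234] (Test). This is sample data used to answer a "
          "question {Hello2} [Ch.8 p. 87 gives more information about...")
original = re.compile(r"\{(.*?)\}(.*?)p\. ([0-9]+)")

m = original.search(sample)

first_braces = m.group(1)  # text inside the first pair of brackets
in_between = m.group(2)    # what the lazy .*? consumed before "p. "
page = m.group(3)          # digits after "p. "

# The lazy group had to swallow "{Hello2}" to reach "p. 87".
crosses_braces = "{" in in_between

print(first_braces, page, crosses_braces)
```

The match starts at the first pair of brackets, and the second group ends up containing the second pair, which is why the whole span is returned.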

You could use a negated character class [^{}], which matches any character except { and }, so the match cannot cross into another pair of brackets:

\{[^{}]*\}[^{}]+p\. [0-9]+
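As a rough check, the sketch below applies this pattern to the sample data from the question in Python. The two capture groups and the helper name find_pages are additions of mine for illustration (the pattern above has no groups), on the assumption that the bracket contents and the page number are what you want to extract.

```python
import re

# Sample data copied from the question.
sample = ("{Hello}, [1234] (Test). This is sample data used to answer a "
          "question {Hello2} [Ch.8 p. 87 gives more information about...")

# Same pattern as above, with capture groups added (an assumption)
# around the bracket contents and the page number.
fixed = re.compile(r"\{([^{}]*)\}[^{}]+p\. ([0-9]+)")

def find_pages(text):
    """Return (full match, bracket contents, page number) for each hit."""
    return [(m.group(0), m.group(1), m.group(2)) for m in fixed.finditer(text)]

results = find_pages(sample)
print(results)
```

The attempt that starts at {Hello} fails, because [^{}]+ cannot step over the { of {Hello2}, so the only match is the one starting at {Hello2}.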


The fourth bird answered Sep 20 '22 16:09
