
Regex to first occurrence only? [duplicate]

Tags:

regex

Let's say I have the following string:

this is a test for the sake of testing. this is only a test. The end.

and I want to select "this is a test" and "this is only a test". What in the world do I need to do?

The following Regex I tried yields a goofy result:

this(.*)test (I also want to capture what is between the two words)

returns "this is a test for the sake of testing. this is only a test"

It seems like this is probably something easy I'm forgetting.

Ben Lesh asked Jan 15 '10 20:01


2 Answers

The .* in your regex is greedy, meaning it matches as many characters as it can while still letting the rest of the pattern match, so it runs all the way to the last test. To make it non-greedy, try:

this(.*?)test

The ? after the quantifier makes it lazy: it matches as few characters as possible, so each match stops at the first test that follows.
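To see the difference on the string from the question, here is a minimal sketch in Python. The choice of Python and its `re` module is an assumption for illustration only (the question does not name a language), but the greedy and lazy quantifiers behave the same way in most regex flavors.

```python
import re

# The sample string from the question.
text = "this is a test for the sake of testing. this is only a test. The end."

# Greedy: .* runs as far as it can, then backtracks to the LAST "test".
greedy = re.findall(r"this(.*)test", text)

# Lazy: .*? stops at the FIRST "test" after each "this".
lazy = re.findall(r"this(.*?)test", text)

print(greedy)  # one long capture spanning both sentences
print(lazy)    # two short captures, one per sentence
```

With one capture group in the pattern, `re.findall` returns the captured text rather than the whole match, which is the "what was between" part the question asks for.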

Andy E answered Oct 20 '22 11:10


Andy E and Ipsquiggle have the right idea, but you might also want to add word boundary assertions, so that "this" and "test" match only as whole words and not as parts of longer words. In Perl and similar regex flavors, that is done with \b.

As it is, this(.*?)test would match "thistles are the greatest", which you probably don't want.

The pattern you want is something like this: \bthis\b(.*?)\btest\b
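As a rough check of the word boundary version, the sketch below compares the two patterns in Python's `re` module (again an assumed language; `\b` works the same way in Perl-style flavors). The variable names are only illustrative.

```python
import re

loose = re.compile(r"this(.*?)test")
strict = re.compile(r"\bthis\b(.*?)\btest\b")

# "this" hides inside "thistles" and "test" inside "greatest".
sample = "thistles are the greatest"
loose_match = loose.search(sample)    # matches across the two partial words
strict_match = strict.search(sample)  # None: neither is a whole word here

# On the string from the question, the strict pattern still finds both phrases.
text = "this is a test for the sake of testing. this is only a test. The end."
captures = strict.findall(text)

print(loose_match.group(1))
print(strict_match)
print(captures)
```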

Platinum Azure answered Oct 20 '22 11:10