
Regex to find last occurrence of pattern in a string

My string is of the form:

"as.asd.sd fdsfs. dfsd  d.sdfsd. sdfsdf sd   .COM"

I only want to match the last segment of whitespace before the last period (.).

So far I am able to capture whitespace but not the very last occurrence using:

\s+(?=\.\w)

How can I make it less greedy?

Seamus asked Jan 26 '17 09:01


2 Answers

In the general case, you can match the last occurrence of any pattern using one of the following schemes:

pattern(?![\s\S]*pattern)
(?s)pattern(?!.*pattern)
pattern(?!(?s:.*)pattern)

Here, [\s\S]* matches zero or more of any characters, as many as possible. In regex engines that support them, the (?s) inline flag or a (?s:...) modifier group can be used instead so that . matches any character, including line breaks.
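As a minimal sketch of the three schemes above in Python's re module (the sample text and the foo\d pattern are made up for illustration; scoped flags such as (?s:...) assume Python 3.6 or later):

```python
import re

# Hypothetical sample: three occurrences of the pattern, spread over several lines.
text = "foo1 bar\nfoo2 baz\nfoo3 qux"
pattern = r"foo\d"

# Scheme 1: pattern(?![\s\S]*pattern)
last = re.search(pattern + r"(?![\s\S]*" + pattern + r")", text)

# Scheme 2: (?s)pattern(?!.*pattern), where (?s) lets . match newlines too
last_dotall = re.search(r"(?s)" + pattern + r"(?!.*" + pattern + r")", text)

# Scheme 3: pattern(?!(?s:.*)pattern), with the flag scoped to one group
last_scoped = re.search(pattern + r"(?!(?s:.*)" + pattern + r")", text)

print(last.group(), last_dotall.group(), last_scoped.group())
```

All three searches should land on the same final occurrence. Without [\s\S] or the s flag, . would stop at the first line break and the lookahead could miss later occurrences on following lines.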

In this case, rather than \s+(?![\s\S]*\s), you may use

\s+(?!\S*\s)

Note that \s and \S are inverse classes, so there is no need for [\s\S]* here; \S* is enough.

Details:

  • \s+ - one or more whitespace chars
  • (?!\S*\s) - a negative lookahead that fails the match if the whitespace is followed by zero or more non-whitespace chars and then another whitespace char, i.e. if any whitespace appears later in the string.
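A quick way to check this against the string from the question, sketched in Python (the variable names are only for illustration):

```python
import re

# The sample string from the question.
s = "as.asd.sd fdsfs. dfsd  d.sdfsd. sdfsdf sd   .COM"

# \s+ grabs a whitespace run; (?!\S*\s) rejects it if any whitespace follows later.
m = re.search(r"\s+(?!\S*\s)", s)

before, after = s[:m.start()], s[m.end():]
print(repr(m.group()), repr(after))
```

Everything after the match should be the final ".COM" segment, since no whitespace follows the matched run.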
Wiktor Stribiżew answered Oct 17 '22 08:10


You can try this:

(\s+)(?=\.[^.]+$)

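(?=\.[^.]+$) is a positive lookahead that requires the whitespace to be followed by a dot and then one or more non-dot characters up to the end of the string (or the end of the line, in multiline mode).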

Demo:

https://regex101.com/r/k9VwC6/3
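The lookahead pattern from this answer can be tried in Python as well. This is a sketch only; the second input, "a b.c d", is a made-up string added to show where this pattern behaves differently from one that only looks for the last whitespace run:

```python
import re

pat = r"(\s+)(?=\.[^.]+$)"

# The sample string from the question.
s = "as.asd.sd fdsfs. dfsd  d.sdfsd. sdfsdf sd   .COM"
m = re.search(pat, s)
print(repr(m.group(1)), repr(s[m.end():]))

# Hypothetical input with no whitespace directly before the last dot:
# the lookahead cannot succeed anywhere, so there is no match.
no_match = re.search(pat, "a b.c d")
print(no_match)
```

This pattern ties the match to the last period, so it only succeeds when the whitespace run sits immediately before that period.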

Mohammad Yusuf answered Oct 17 '22 07:10