
Regex: how to match a word that doesn't end with a specific character

I would like to match a whole "word": one that starts with a number character, may include special characters, and does not end with a '%'.

Match these:

  • 112 (whole numbers)
  • 10-12 (ranges)
  • 11/2 (fractions)
  • 11.2 (decimal numbers)
  • 1,200 (thousand separator)

but not

  • 12% (percentages)
  • A38 (words starting with an alphabetic character)

I've tried these regular expressions:

(\b\p{N}\S*)

but that returns '12%' in '12%'

(\b\p{N}(?:(?!%)\S)*)

but that returns '12' in '12%'

Can I make an exception to the \S term that disregards %? Or will I have to do something else?

I'll be using it in PHP, but just write as you would like and I'll convert it to PHP.

bonna asked Nov 18 '11 11:11


2 Answers

This matches your specification:

\b\p{N}\S*+(?<!%)

Explanation:

\b       # Word boundary: start of the number
\p{N}    # One number character (any Unicode number, not only 0-9)
\S*+     # Any number of non-space characters, match possessively
(?<!%)   # Last character must not be a %

The possessive quantifier \S*+ makes sure that the regex engine will not backtrack into a string of non-space characters it has already matched. Therefore, it will not "give back" a % to match 12 within 12%.

Of course, that will also match 1!abc, so you might want to be more specific than \S which matches anything that's not a whitespace character.
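
To try this outside PHP, here is a minimal sketch in Python (my choice of language, not part of the question). Two assumptions: Python's re module has no \p{N}, so \d stands in for it, and the possessive \S*+ is only accepted from Python 3.11 on, so older versions fall back to a lookahead plus backreference, which should give the same no-backtracking effect. The names NUMBER_TOKEN and find_number_tokens are made up for the example.

```python
import re

# Assumption: \d stands in for \p{N}, which Python's re module does not support.
try:
    # Possessive \S*+ is accepted by Python's re from version 3.11 on.
    NUMBER_TOKEN = re.compile(r"\b\d\S*+(?<!%)")
except re.error:
    # Older versions: the lookahead captures the whole run of non-space
    # characters and \1 consumes it, so the engine cannot give any of it back.
    NUMBER_TOKEN = re.compile(r"\b\d(?=(\S*))\1(?<!%)")


def find_number_tokens(text):
    """Return every token that starts with a digit and does not end in '%'."""
    return [m.group(0) for m in NUMBER_TOKEN.finditer(text)]


samples = "112 10-12 11/2 11.2 1,200 12% A38 1!abc"
print(find_number_tokens(samples))
# ['112', '10-12', '11/2', '11.2', '1,200', '1!abc']
```

12% and A38 are skipped, while 1!abc still gets through, which is the \S caveat mentioned above.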

Tim Pietzcker answered Sep 18 '22 17:09


Can I make an exception to the \S term that disregards %?

Yes you can:

[^%\s]

See this expression \b\d[^%\s]* here on Regexr
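
A quick check of that expression, sketched in Python (an assumption on my side, since the question targets PHP). As far as I can tell, the character class on its own stops in front of the %, so 12% still yields 12, the same result as the second attempt in the question. The second pattern below is a hypothetical variant that is not part of this answer: it adds a (?!\S) lookahead so the match has to run up to whitespace or the end of the string.

```python
import re

# The pattern from the answer: a digit, then anything except '%' or whitespace.
CLASS_ONLY = re.compile(r"\b\d[^%\s]*")

# Hypothetical variant (not from the answer): the match must also be followed
# by whitespace or the end of the string, so a trailing '%' rejects the token.
CLASS_WITH_LOOKAHEAD = re.compile(r"\b\d[^%\s]*(?!\S)")

samples = "112 10-12 11/2 11.2 1,200 12% A38"

print(CLASS_ONLY.findall(samples))
# ['112', '10-12', '11/2', '11.2', '1,200', '12']

print(CLASS_WITH_LOOKAHEAD.findall(samples))
# ['112', '10-12', '11/2', '11.2', '1,200']
```

Whether the lookahead is the right fix depends on what may follow a number in your text; trailing punctuation such as a comma would end up inside the match.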

stema answered Sep 19 '22 17:09