Regular expression that matches number in string but not percentages

I need to know if there is a regular expression for testing for the presence of numbers in strings that:

  • Matches Lorem 20 Ipsum
  • Matches Lorem 2,5 Ipsum
  • Matches Lorem 20.5 Ipsum
  • Does not match Lorem 2% Ipsum
  • Does not match Lorem 20.5% Ipsum
  • Does not match Lorem 20,5% Ipsum
  • Does not match Lorem 2 percent Ipsum
  • Does not match Lorem 20.5 percent Ipsum
  • Does not match Lorem 20,5 percent Ipsum
  • Matches Lorem 20 Ipsum 2% dolor
  • Matches Lorem 2,5 Ipsum 20.5% dolor
  • Matches Lorem 20.5 Ipsum 20,5% dolor

That is, I need a regular expression that tells me whether a string contains one or more numbers that are not percentage values.

I've tried something like /[0-9\.,]+[^%]/, but this doesn't seem to work. I think that is because "digits followed by something other than a percent sign" also matches the 20 in the string 20%. Additionally, I don't know how to exclude the whole word percent in addition to the % character.
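The backtracking behaviour suspected above can be reproduced with a minimal sketch. Python is used here only for illustration (the question does not name a language), and the variable names are hypothetical.

```python
import re

# The pattern from the attempt above, written as a Python raw string.
attempt = re.compile(r"[0-9\.,]+[^%]")

# The greedy class first takes "20", but then [^%] cannot match "%".
# The engine backtracks: the class keeps only "2" and [^%] consumes "0".
match = attempt.search("Lorem 20% Ipsum")
print(match.group(0))  # prints: 20
```

So the percentage is reported as a match even though the pattern never consumes the % itself, which is why a lookahead is a better fit than a negated character class.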

lorenzo-s asked Nov 03 '12 19:11


1 Answer

This will do what you need:

\b                     -- word boundary
\d+                    -- one or more digits
(?:\.\d+)?             -- optionally followed by a period and one or more digits
\b                     -- word boundary
\s+                    -- one or more whitespace characters
(?!%|percent)          -- NOT followed by a % or the word 'percent'

--EDIT--

The key here is the "negative lookahead" on the final line, which causes the match to fail if a percent sign or the literal "percent" follows a number and one or more whitespace characters. Other uses of negative lookahead in JavaScript RegExps can be found at "Negative lookahead Regular Expression".
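As a rough check, the first pattern can be joined into one line and run against the strings from the question. This is only a sketch in Python; `first_pattern` and `has_plain_number` are hypothetical names chosen for illustration.

```python
import re

# The first pattern from this answer, joined into a single line.
first_pattern = re.compile(r"\b\d+(?:\.\d+)?\b\s+(?!%|percent)")

should_match = [
    "Lorem 20 Ipsum",
    "Lorem 2,5 Ipsum",
    "Lorem 20.5 Ipsum",
    "Lorem 20 Ipsum 2% dolor",
    "Lorem 2,5 Ipsum 20.5% dolor",
    "Lorem 20.5 Ipsum 20,5% dolor",
]
should_not_match = [
    "Lorem 2% Ipsum",
    "Lorem 20.5% Ipsum",
    "Lorem 20,5% Ipsum",
    "Lorem 2 percent Ipsum",
    "Lorem 20.5 percent Ipsum",
    "Lorem 20,5 percent Ipsum",
]


def has_plain_number(text: str) -> bool:
    """Return True if the pattern finds a number that is not a percentage."""
    return first_pattern.search(text) is not None


for text in should_match:
    assert has_plain_number(text), text
for text in should_not_match:
    assert not has_plain_number(text), text

# Two limitations that follow from the mandatory \s+:
# a number at the very end of the string is never found, and
# \s+ can give back one of two spaces so that the lookahead passes.
print(has_plain_number("Lorem 20"))          # prints: False
print(has_plain_number("Lorem 2  percent"))  # prints: True
```

All twelve strings from the question behave as requested. The last two lines show inputs outside that list where this version gives a different answer than one might expect.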

--2ND EDIT--

Congrats to Enrico for solving the most general case. While his solution below is correct, it contains several extraneous operators. Here is a more succinct solution.

(                         -- start capture
  \d+                     -- one or more digits
  (?:[\.,]\d+)?           -- optional period or comma followed by one or more digits
  \b                      -- word boundary
  (?!                     -- start negative lookahead
    (?:[\.,]\d+)          -- must not be followed by period or comma plus digits
  |                       --    or
    (?:                   -- start option group
      \s?%                -- optional space plus percent sign
    |                     --   or
      \spercent           -- required space and literal 'percent'
    )                     -- end option group
  )                       -- end negative lookahead
)                         -- end capture group
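The `--` annotations above are explanatory and not regex syntax, so the listing cannot be pasted into an engine as is. One way to keep the layout runnable is a free-spacing mode. The sketch below assumes Python's `re.VERBOSE`, where comments start with `#`; `NUMBER_NOT_PERCENT` and `plain_numbers` are hypothetical names.

```python
import re

# The pattern above in free-spacing form: whitespace inside the pattern is
# ignored and "#" starts a comment, so the annotated layout runs unchanged.
NUMBER_NOT_PERCENT = re.compile(
    r"""
    (                     # start capture
      \d+                 # one or more digits
      (?:[\.,]\d+)?       # optional period or comma followed by digits
      \b                  # word boundary
      (?!                 # start negative lookahead
        (?:[\.,]\d+)      # must not be followed by period or comma plus digits
      |                   #    or
        (?:               # start option group
          \s?%            # optional whitespace plus percent sign
        |                 #   or
          \spercent       # required whitespace and literal 'percent'
        )                 # end option group
      )                   # end negative lookahead
    )                     # end capture group
    """,
    re.VERBOSE,
)


def plain_numbers(text: str) -> list[str]:
    """Return every number in text that is not written as a percentage."""
    return NUMBER_NOT_PERCENT.findall(text)


print(plain_numbers("Lorem 20 Ipsum 2% dolor"))       # prints: ['20']
print(plain_numbers("Lorem 2,5 Ipsum 20.5% dolor"))   # prints: ['2,5']
print(plain_numbers("Lorem 20,5 percent Ipsum"))      # prints: []
```

The first alternative in the lookahead is what stops the engine from backtracking to `20` in `20.5%`: a shorter match is rejected whenever a decimal part still follows it.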

Rob Raisch answered Sep 18 '22 12:09