Regex numbers from string

Tags: regex

I am trying to write a regex that finds only the numbers in a given string. What I mean is:

Input: My number is +12 345 678. I have galaxy s3, its symbol 34abc.

Output: 345 and 678 (but not +12, 3 from word s3 or 34 from 34abc)

I tried plain digits (\d+) and combinations with whitespace and word characters. The closest was ^\d$, but that doesn't work because my numbers are part of a bigger string, not the whole string themselves. Can you give me a hint?

------- EDIT

Looks like I just don't know how to check a character without actually including it in the result, e.g. "a digit that follows a space character (without the space itself)".

Malvinka asked Mar 02 '26 20:03



1 Answer

In the general case, you can use lookbehind and lookahead:

(?<=^|\s)\d+(?=$|\s)

Only \d+ ends up in the match. The lookbehind and lookahead are zero-width assertions: they check the neighbouring characters without including them in the result.
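
As a minimal sketch, here is the pattern tried on the sample input from the question. Python is an assumption here, since the question does not name a language. Python's built-in re module only accepts fixed-width lookbehind, so it rejects (?<=^|\s); the sketch swaps in the equivalent negative form (?<!\S)\d+(?!\S), which reads "not preceded and not followed by a non-whitespace character". The name STANDALONE_NUMBER is made up for illustration.

```python
import re

# Sample input taken from the question.
text = "My number is +12 345 678. I have galaxy s3, its symbol 34abc."

# Equivalent of (?<=^|\s)\d+(?=$|\s), rewritten with negative lookarounds
# because Python's re module requires fixed-width lookbehind.
STANDALONE_NUMBER = re.compile(r"(?<!\S)\d+(?!\S)")

# findall returns only the digits; the surrounding whitespace is checked
# by the lookarounds but is not part of the match.
matches = STANDALONE_NUMBER.findall(text)
print(matches)  # ['345']
```

Only 345 is found here: 678 is followed by a dot rather than whitespace, so it is skipped when whitespace is the only delimiter.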

I included only whitespace as a delimiter in the regex, but you may replace \s with any character class your requirements call for. For example, to allow dots as separators (both before and after the digits), use the following regex:

(?<=^|[\s.])\d+(?=$|[\s.])
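
A rough Python sketch of the dot-tolerant variant, run on the question's sample input. Python is an assumption, and because its re module requires fixed-width lookbehind, the sketch uses the equivalent negative form (?<![^\s.])\d+(?![^\s.]), meaning "not preceded and not followed by anything other than whitespace or a dot". The name NUMBER_WITH_DOTS is hypothetical.

```python
import re

# Sample input taken from the question.
text = "My number is +12 345 678. I have galaxy s3, its symbol 34abc."

# Equivalent of (?<=^|[\s.])\d+(?=$|[\s.]) using fixed-width negative lookarounds.
NUMBER_WITH_DOTS = re.compile(r"(?<![^\s.])\d+(?![^\s.])")

matches = NUMBER_WITH_DOTS.findall(text)
print(matches)  # ['345', '678']

# Caveat: once the dot is a separator, a decimal such as "3.14"
# is split into two separate matches.
decimal_parts = NUMBER_WITH_DOTS.findall("3.14")
print(decimal_parts)  # ['3', '14']
```

With the dot accepted as a delimiter, both 345 and 678 are returned, which is the output the question asks for.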

The (?<=^|\s) should be read as follows:

  • (?<= ... ) defines the lookbehind group.
  • The expression which must precede the \d+ is ^|\s, meaning "either start of the line (^) or whitespace".

Similarly, (?=$|\s) defines the lookahead group (it must follow the captured digits), which is either end of the line ($) or whitespace.


A note on \b, mentioned in other answers: it is a nice feature that means "word boundary", but the set of "word characters" is not customizable. For example, the "+" character is always treated as a separator, and you can't change this if you use \b. With lookaround, you can tailor the separators to your needs.
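
The difference can be seen in a small Python comparison on the question's sample input. This is a sketch: Python is an assumption, and the lookaround pattern is written in the fixed-width negative form that Python's re module accepts, with whitespace and dots as the chosen separators.

```python
import re

# Sample input taken from the question.
text = "My number is +12 345 678. I have galaxy s3, its symbol 34abc."

# \b treats "+" as a non-word character, so there is a word boundary
# between "+" and "1" and the unwanted 12 is matched.
with_word_boundary = re.findall(r"\b\d+\b", text)
print(with_word_boundary)  # ['12', '345', '678']

# Lookarounds with an explicit separator set (whitespace or dot) reject "+12".
with_lookaround = re.findall(r"(?<![^\s.])\d+(?![^\s.])", text)
print(with_lookaround)  # ['345', '678']
```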

Alex Shesterov answered Mar 05 '26 21:03
