Regex to find numbers in a string

Tags: java, regex

I'm using removeNumbers to remove every standalone number (a run of digits delimited by spaces or by the start or end of the string) from a given string, with the regex
"(^| )\\d+($|( \\d+)+($| )| )"

Here's the code:

public class Regex {    
  private static String removeNumbers(String s) {
     s = s.trim();
     s = s.replaceAll(" +", " ");
     s = s.replaceAll("(^| )\\d+($|( \\d+)+($| )| )", " ");
     return s.trim();
  }

  public static void main(String[] args) {
     // " 123 " is included because it appears in the output listed below
     String[] tests = new String[] {
        "123",
        " 123 ",
        "123 456 stack 789",
        "123 456 789 101112 131415 161718 192021",
        "stack 123 456 overflow 789 com",
        "stack 123 456 overflow 789",
        "123stack 456",
        "123 stack456overflow",
        "123 stack456",
        "123! @456#567"
     };
     for (int i = 0; i < tests.length; i++) {
        String test = tests[i];
        System.out.println("\"" + test + "\" => \"" + removeNumbers(test) + "\"");
     }  
  }    
}

Output:

"123" => ""
" 123 " => ""
"123 456 stack 789" => "stack"
"123 456 789 101112 131415 161718 192021" => ""
"stack 123 456 overflow 789 com" => "stack overflow com"
"stack 123 456 overflow 789" => "stack overflow"
"123stack 456" => "123stack"
"123 stack456overflow" => "stack456overflow"
"123 stack456" => "stack456"
"123! @456#567" => "123! @456#567"

Is there any better way to do this?

Edit:

As suggested by @mbomb007 in his previous answer, the regex "( |^)[\\d ]+( |$)" works as well:

private static String removeNumbers(String s) {
   s = s.trim();
   s = s.replaceAll(" +", " ");
   s = s.replaceAll("( |^)[\\d ]+( |$)", " ");
   return s.trim();
}
Bharat Khatri asked Apr 30 '15 18:04


2 Answers

AFAIU, you can just do:

private static String removeNumbers(String s) {
    return s.replaceAll("\\b\\d+\\b", "").replaceAll(" +", " ").trim();
}

\b\d+\b matches a run of one or more digits with a word boundary on each side, i.e. digits that form a whole word.

EDIT:

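Since the pattern must not match numbers in a string like "123! @456#567" (where \b treats the punctuation as a word boundary), a combination of positive lookbehind and lookahead conditions can be used instead, so that a number is matched only when it is delimited by spaces or by the start or end of the string: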

private static String removeNumbers(String s) {
    return s.replaceAll("(?<= |^)\\d+(?= |$)", " ").replaceAll(" +", " ").trim();
}
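
To make the difference between the two versions concrete, here is a minimal sketch that runs both on a few of the question's test strings. The class name RemoveNumbersDemo and the method names are made up for illustration; the two patterns are the ones shown above.

```java
public class RemoveNumbersDemo {
    // First version: \b also counts punctuation as a word boundary.
    static String withWordBoundaries(String s) {
        return s.replaceAll("\\b\\d+\\b", "").replaceAll(" +", " ").trim();
    }

    // Second version: digits must be delimited by a space or by the start/end of the string.
    static String withLookarounds(String s) {
        return s.replaceAll("(?<= |^)\\d+(?= |$)", " ").replaceAll(" +", " ").trim();
    }

    public static void main(String[] args) {
        String[] tests = {"123! @456#567", "123stack 456", "stack 123 456 overflow 789"};
        for (String test : tests) {
            System.out.println("\"" + test + "\" => \"" + withWordBoundaries(test)
                    + "\" vs \"" + withLookarounds(test) + "\"");
        }
    }
}
```

For "123! @456#567" the first version returns "! @#", while the second leaves the string unchanged. The other two inputs give the same result with either version.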
M A answered Sep 26 '22 04:09


Your regex is a bit redundant (and also doesn't quite fit your test cases). You can use this:

"\\b[ ]*(?<![^\\d\\s])[\\d]+(?![^\\d\\s])[ ]*\\b"

The \b escape represents a word boundary (the start or end of a word). I also use [ ]* so that the spaces between numbers get removed. This regex also leaves numbers inside words untouched, just like you want.

EDIT: I added a negative lookbehind and a negative lookahead.

(?<![^\\d\\s]) - This ensures that the character immediately preceding the digits, if there is one, is a digit or whitespace.

(?![^\\d\\s]) - This ensures that the character immediately following the digits, if there is one, is a digit or whitespace.
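
As a rough sketch of how this pattern might be plugged into removeNumbers: the answer does not say what the replacement string should be, so replacing each match with a single space and then collapsing spaces is an assumption here, and the class name LookaroundDemo is made up for illustration.

```java
public class LookaroundDemo {
    // The pattern from this answer, unchanged.
    private static final String PATTERN =
            "\\b[ ]*(?<![^\\d\\s])[\\d]+(?![^\\d\\s])[ ]*\\b";

    // Assumption: replace each match with one space, then collapse runs of spaces and trim.
    static String removeNumbers(String s) {
        return s.replaceAll(PATTERN, " ").replaceAll(" +", " ").trim();
    }

    public static void main(String[] args) {
        String[] tests = {
            "123 stack456 789",
            "stack 123 456 overflow",
            "123stack 456",
            "123! @456#567"
        };
        for (String test : tests) {
            System.out.println("\"" + test + "\" => \"" + removeNumbers(test) + "\"");
        }
    }
}
```

Note that the lookarounds accept a neighbouring digit, so on their own they would still match the "56" in "stack456". It is the leading and trailing \b that rule this out.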

Try it here with your test cases. (Updated the hyperlink for added test case)

mbomb007 answered Sep 24 '22 04:09