
How to remove invalid characters from a string?

Tags:

java

android

I don't know how to remove invalid characters from a string in Java. I'm trying to remove every character that is not a number, a letter, or one of ( ) [ ] . How can I do this?

Thanks

arberb asked Feb 04 '12 20:02


4 Answers

String foo = "this is a thing with & in it";
foo = foo.replaceAll("[^A-Za-z0-9()\\[\\]]", "");

Javadocs are your friend. Regular expressions are also your friend.

Edit:

That being said, this only covers the Latin alphabet; you can adjust it accordingly. \\w can stand in for A-Za-z0-9 to denote a "word" character if that works for your case, though it also matches _.
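
To go beyond the Latin alphabet, one option is to swap the ASCII ranges for Unicode property classes. This is only a sketch: it assumes \p{L} (any Unicode letter) and \p{N} (any Unicode number) are acceptable definitions of "letters" and "numbers" for your case, and the class and method names are made up for illustration.

```java
final class UnicodeFilter {
    // Keeps Unicode letters (\p{L}), Unicode numbers (\p{N}) and ( ) [ ].
    // Combining marks (\p{M}) are NOT kept, so accents written as separate
    // combining characters would be stripped.
    static String clean(String input) {
        return input.replaceAll("[^\\p{L}\\p{N}()\\[\\]]", "");
    }

    public static void main(String[] args) {
        // \u00e9 is a precomposed e-acute; \u0442\u0435\u0441\u0442 is a Cyrillic word.
        String sample = "caf\u00e9 & (\u0442\u0435\u0441\u0442) [42]!";
        // The spaces, '&' and '!' are removed; everything else survives.
        System.out.println(clean(sample));
    }
}
```

Whether symbols such as _ or combining accents should also be kept depends on your data, so adjust the character class to match.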

Brian Roach answered Nov 03 '22 16:11


Using Guava, and almost certainly more efficient (and more readable) than regexes:

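// Newer Guava releases expose these matchers as methods (javaDigit(), javaLetter())
// rather than the older JAVA_DIGIT / JAVA_LETTER constants.
CharMatcher desired = CharMatcher.javaDigit()
  .or(CharMatcher.javaLetter())
  .or(CharMatcher.anyOf("()[]"))
  .precomputed(); // optional, may improve performance, YMMV
return desired.retainFrom(string);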
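
If pulling in Guava is not an option, roughly the same filtering can be done with the JDK alone. The following is a sketch, not a drop-in replacement for CharMatcher: the class and method names are hypothetical, and it assumes Character.isLetterOrDigit is the right notion of "letter or number" for your input.

```java
final class CharFilter {
    // Copies only letters, digits and ( ) [ ] into the result.
    // Works char by char, so supplementary characters (surrogate pairs) are dropped.
    static String retainValid(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (Character.isLetterOrDigit(ch) || "()[]".indexOf(ch) >= 0) {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(retainValid("123abc&^%[]()")); // prints 123abc[]()
    }
}
```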

Louis Wasserman answered Nov 03 '22 14:11


Try this:

String s = "123abc&^%[]()";
s = s.replaceAll("[^A-Za-z0-9()\\[\\]]", "");
System.out.println(s);

This removes the characters "&^%" from the sample string, leaving s as "123abc[]()".
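
String.replaceAll compiles the regular expression on every call, so if many strings need cleaning it may be worth compiling the pattern once and reusing it. A minimal sketch, with a hypothetical Sanitizer class wrapping the same expression:

```java
import java.util.regex.Pattern;

final class Sanitizer {
    // Compiled once and shared; Pattern instances are immutable and thread-safe.
    private static final Pattern INVALID = Pattern.compile("[^A-Za-z0-9()\\[\\]]");

    static String sanitize(String input) {
        return INVALID.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123abc&^%[]()", "a b-c", "(ok)"}) {
            System.out.println(sanitize(s));
        }
        // prints:
        // 123abc[]()
        // abc
        // (ok)
    }
}
```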

Óscar López answered Nov 03 '22 14:11


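import java.util.regex.Pattern;

public static void main(String[] args) {
    String c = "hjdg$h&jk8^i0ssh6+/?:().,+-#";
    System.out.println(c);
    // Anything that is not a letter, a digit, or one of / ? : ( ) . , ' + - is invalid.
    Pattern pt = Pattern.compile("[^a-zA-Z0-9/?:().,'+-]");
    // Replace through the compiled pattern. No matches() check is needed:
    // replaceAll leaves a string with no invalid characters unchanged, and
    // matches() would wrongly skip a string consisting of one invalid character.
    c = pt.matcher(c).replaceAll("");
    System.out.println(c);
}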

Raj answered Nov 03 '22 16:11