
Remove all special characters in Java [duplicate]




Possible Duplicate:
Replacing all non-alphanumeric characters with empty strings

import java.util.Scanner;
import java.util.regex.*;

public class io {
    public static void main(String args[]) {
        Scanner scan = new Scanner(System.in);
        String c;
        Pattern pt = Pattern.compile("[^a-zA-Z0-9]");
        Matcher match = pt.matcher(c);

Case 1

Input : hjdg$h&jk8^i0ssh6
Expect : hjdghjk8i0ssh6
Output : hjdgh&jk8^issh6

Case 2

Input : hjdgh&jk8i0ssh6
Expect : hjdghjk8i0ssh6
Output : hjdghjk8i0ssh6

Case 3

Input : hjdgh&j&k8i0ssh6
Expect : hjdghjk8i0ssh6
Output : hjdghjki0ssh6

Can anyone please help me figure out what is wrong with my code logic?

Ravi asked Jan 16 '13 15:01


3 Answers

You can read a line and strip all special characters safely this way.
Keep in mind that \\W does not match underscores, so a "\\W" pattern would leave them in place.

    Scanner scan = new Scanner(System.in);
    System.out.println(scan.nextLine().replaceAll("[^a-zA-Z0-9]", ""));
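To make the underscore caveat concrete, here is a minimal sketch comparing the two patterns side by side. The class name, method names, and sample input are hypothetical and only serve the illustration.

```java
public class UnderscoreDemo {
    // Removes everything except ASCII letters and digits, underscores included.
    static String stripStrict(String input) {
        return input.replaceAll("[^a-zA-Z0-9]", "");
    }

    // Removes every non-word character; \w covers [a-zA-Z_0-9], so underscores survive.
    static String stripNonWord(String input) {
        return input.replaceAll("\\W", "");
    }

    public static void main(String[] args) {
        // Hypothetical input: the question's first test case with an underscore added.
        String sample = "hjdg$h&jk8^i0_ssh6";
        System.out.println(stripStrict(sample));  // hjdghjk8i0ssh6
        System.out.println(stripNonWord(sample)); // hjdghjk8i0_ssh6
    }
}
```

If underscores should go as well, the explicit character class is the simpler choice.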
rtheunissen answered Sep 30 '22 22:09


Use "[\\W]" or "[^a-zA-Z0-9]" as the regex to match any special character, and use String.replaceAll(regex, String) to replace the special characters with an empty string. Remember that the first argument of String.replaceAll is a regex, so a matched character has to be escaped with a backslash if you want it treated as a literal character.

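        String c = "hjdg$h&jk8^i0ssh6";
        Pattern pt = Pattern.compile("[^a-zA-Z0-9]");
        Matcher match = pt.matcher(c);
        // group() is only valid after a successful find()
        while (match.find()) {
            String s = match.group();
            // escape the matched character so replaceAll treats it as a literal
            c = c.replaceAll("\\" + s, "");
        }
        System.out.println(c);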
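As a variation on the manual backslash escape, Pattern.quote from the standard library turns any string into a literal pattern. The sketch below assumes the same character class as the question; the class and method names are made up for the example. It also shows that a single Matcher.replaceAll call gives the same result without a loop.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteDemo {
    private static final Pattern SPECIAL = Pattern.compile("[^a-zA-Z0-9]");

    // Removes each matched special character, quoting it instead of prepending a backslash.
    static String stripByQuoting(String input) {
        String result = input;
        // The matcher scans the original input, which never changes (strings are immutable).
        Matcher match = SPECIAL.matcher(input);
        while (match.find()) {
            result = result.replaceAll(Pattern.quote(match.group()), "");
        }
        return result;
    }

    // Same result in one pass: replace every match with the empty string.
    static String stripInOnePass(String input) {
        return SPECIAL.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String c = "hjdg$h&jk8^i0ssh6";
        System.out.println(stripByQuoting(c));
        System.out.println(stripInOnePass(c));
    }
}
```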
PermGenError answered Sep 30 '22 23:09


Your problem is that the indices returned by match.start() correspond to the position of the character as it appeared in the original string when you matched it; however, as you rewrite the string c every time, these indices become incorrect.

The best approach to solve this is to use replaceAll, for example:

        System.out.println(c.replaceAll("[^a-zA-Z0-9]", ""));
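To illustrate how indices go stale, here is a small sketch. It is not the asker's exact code, only a hypothetical reconstruction of the same failure mode using a StringBuilder: the first method deletes at positions taken from the original string, the second corrects each position by the number of characters already removed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StaleIndexDemo {
    private static final Pattern SPECIAL = Pattern.compile("[^a-zA-Z0-9]");

    // Deliberately buggy: match.start() refers to the original string,
    // but the buffer shrinks after every deletion.
    static String stripWithStaleIndices(String input) {
        StringBuilder sb = new StringBuilder(input);
        Matcher match = SPECIAL.matcher(input);
        while (match.find()) {
            if (match.start() < sb.length()) {
                sb.deleteCharAt(match.start());
            }
        }
        return sb.toString();
    }

    // Corrected: shift each match position left by the number of chars removed so far.
    static String stripWithAdjustedIndices(String input) {
        StringBuilder sb = new StringBuilder(input);
        Matcher match = SPECIAL.matcher(input);
        int removed = 0;
        while (match.find()) {
            sb.delete(match.start() - removed, match.end() - removed);
            removed += match.end() - match.start();
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "hjdg$h&jk8^i0ssh6";
        System.out.println(stripWithStaleIndices(input));    // wrong characters removed
        System.out.println(stripWithAdjustedIndices(input)); // hjdghjk8i0ssh6
        System.out.println(input.replaceAll("[^a-zA-Z0-9]", "")); // same as the line above
    }
}
```

The offset bookkeeping is exactly what replaceAll saves you from, which is why it is the simpler option here.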
Sébastien Le Callonnec answered Sep 30 '22 22:09
