Remove non-ASCII characters from String in Java

I have a URI that contains non-ASCII characters, like this one:

http://www.abc.de/qq/qq.ww?MIval=typo3_bsl_int_Smtliste&p_smtbez=Schmalbl�ttrigeSomerzischeruchtanb

How can I remove "�" from this URI?

M.M asked May 13 '12 18:05



2 Answers

I'm guessing that the source of the URL is more at fault. Perhaps you're fixing the wrong problem? Removing "strange" characters from a URI might give it an entirely different meaning.
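As a rough sketch of what fixing the source could look like: if the URI is assembled from text, percent-encoding the query value before it goes into the URI keeps the character instead of losing it. The class name `EncodeQueryValue`, the helper `encodeValue`, and the sample value are hypothetical; in particular, the question does not say which character was lost, so the "ä" here is only an assumption. The two-argument `URLEncoder.encode(String, Charset)` overload requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EncodeQueryValue {
    // Percent-encodes a single query-parameter value using UTF-8.
    static String encodeValue(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical input: assumes the lost character was U+00E4 (a with diaeresis).
        String value = "Schmalbl\u00e4ttrige";
        System.out.println(encodeValue(value)); // prints Schmalbl%C3%A4ttrige
    }
}
```

The encoded value is plain ASCII, so nothing needs to be stripped afterwards.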

With that said, you may be able to remove all of the non-ASCII characters with a simple string replacement:

String fixed = original.replaceAll("[^\\x20-\\x7e]", "");

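Or, if you only need to drop characters outside the Basic Multilingual Plane (the ones that take four bytes in UTF-8), use the narrower pattern below. Note that it leaves "�" (U+FFFD) in place, because that character lies inside the range being kept: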

String fixed = original.replaceAll("[^\\u0000-\\uFFFF]", "");
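
To make the difference between the two patterns concrete, here is a minimal sketch that runs both on the same input. The class name `StripNonAscii`, the helper names, and the sample string are made up for illustration; the sample uses the replacement character U+FFFD plus one emoji (a supplementary code point) and is not the URI from the question.

```java
public class StripNonAscii {
    // Keeps only printable ASCII (space through tilde); everything else is dropped.
    static String keepPrintableAscii(String s) {
        return s.replaceAll("[^\\x20-\\x7e]", "");
    }

    // Drops only supplementary code points (above U+FFFF); BMP characters stay.
    static String dropSupplementary(String s) {
        return s.replaceAll("[^\\u0000-\\uFFFF]", "");
    }

    public static void main(String[] args) {
        // Replacement character in the middle, emoji (surrogate pair) at the end.
        String sample = "Schmalbl\uFFFDttrige\uD83D\uDE00";

        // The first pattern removes both the replacement character and the emoji.
        System.out.println(keepPrintableAscii(sample).equals("Schmalblttrige")); // prints true

        // The second pattern removes the emoji but keeps the replacement character.
        System.out.println(dropSupplementary(sample).equals("Schmalbl\uFFFDttrige")); // prints true
    }
}
```

Only the first pattern gets rid of "�"; the second is useful mainly when the goal is to avoid four-byte UTF-8 sequences.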

Cᴏʀʏ answered Oct 12 '22 23:10


yourstring = yourstring.replaceAll("[^\\p{ASCII}]", "");
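
One detail worth knowing, shown here as a small sketch with hypothetical names (`AsciiClassDemo`, `keepAscii`, `keepPrintableAscii`): in Java's regex syntax `\p{ASCII}` covers the whole 7-bit range 0x00 to 0x7F, so control characters such as tab survive, whereas a pattern limited to 0x20 through 0x7E removes them too.

```java
public class AsciiClassDemo {
    // Removes every character outside the 7-bit ASCII range (0x00-0x7F).
    static String keepAscii(String s) {
        return s.replaceAll("[^\\p{ASCII}]", "");
    }

    // Removes everything except printable ASCII (0x20-0x7E).
    static String keepPrintableAscii(String s) {
        return s.replaceAll("[^\\x20-\\x7e]", "");
    }

    public static void main(String[] args) {
        // Sample with a tab (ASCII control character) and an accented letter.
        String sample = "a\tb\u00e9c";

        // The tab is ASCII, so it is kept; the accented letter is removed.
        System.out.println(keepAscii(sample).equals("a\tbc")); // prints true

        // Both the tab and the accented letter are removed.
        System.out.println(keepPrintableAscii(sample).equals("abc")); // prints true
    }
}
```

For a URI, the printable-only range is probably the safer choice, since control characters are not valid there either.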

daneshkohan answered Oct 13 '22 00:10