I have a URI that contains non-ASCII characters, like this:
http://www.abc.de/qq/qq.ww?MIval=typo3_bsl_int_Smtliste&p_smtbez=Schmalbl�ttrigeSomerzischeruchtanb
How can I remove "�" from this URI?
Use the replaceAll() method to replace the non-ASCII characters with an empty string.
To replace the ASCII control characters:
my_string.replaceAll("\\p{Cntrl}", "?");
The following will replace everything that is not a printable ASCII character (\p{Print} is shorthand for [\p{Graph}\x20]), including accented characters:
my_string.replaceAll("[^\\p{Print}]", "?");
my_string.replaceAll("\\p{C}", "?");
This will replace all non-printable characters, where \p{C} selects the invisible control characters and unused code points.
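To make the difference between the three patterns above concrete, here is a minimal sketch. The class name, method names and sample string are made up for illustration, and it assumes the default regex flags (without UNICODE_CHARACTER_CLASS, the POSIX classes \p{Cntrl} and \p{Print} only cover ASCII).

```java
public class NonPrintableDemo {

    // Replaces ASCII control characters (0x00-0x1F and 0x7F) with '?'.
    static String replaceAsciiControl(String s) {
        return s.replaceAll("\\p{Cntrl}", "?");
    }

    // Replaces everything that is NOT printable ASCII with '?'.
    // This also hits accented letters, since they are outside ASCII.
    static String replaceNonPrintableAscii(String s) {
        return s.replaceAll("[^\\p{Print}]", "?");
    }

    // Replaces code points in the Unicode "Other" category
    // (control, format, surrogate, private use, unassigned) with '?'.
    static String replaceUnicodeOther(String s) {
        return s.replaceAll("\\p{C}", "?");
    }

    public static void main(String[] args) {
        // Hypothetical sample: an accented letter, a tab and a NUL character.
        String sample = "Bl\u00e4tter\tand\0more";

        // Accented letter kept, tab and NUL replaced.
        System.out.println(replaceAsciiControl(sample));
        // Accented letter, tab and NUL all replaced.
        System.out.println(replaceNonPrintableAscii(sample));
        // Accented letter kept, tab and NUL replaced.
        System.out.println(replaceUnicodeOther(sample));
    }
}
```

Only the [^\p{Print}] variant touches the accented letter, so it is the only one of the three that would change a character like the one in the question.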
I'm guessing that the source of the URL is more at fault. Perhaps you're fixing the wrong problem? Removing "strange" characters from a URI might give it an entirely different meaning.
With that said, you may be able to remove all of the non-ASCII characters with a simple string replacement:
String fixed = original.replaceAll("[^\\x20-\\x7e]", "");
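Or, if you only want to strip characters outside the Basic Multilingual Plane (the ones that take four bytes in UTF-8), use the pattern below. Note that if the "�" is U+FFFD, it lies inside that range, so this pattern alone leaves it in place: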
String fixed = original.replaceAll("[^\\u0000-\\uFFFF]", "");
yourstring = yourstring.replaceAll("[^\\p{ASCII}]", "");
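As a rough comparison of the three removal patterns from these answers, here is a small sketch. The class and method names are invented for the example, and the input assumes the stray character is the Unicode replacement character U+FFFD, which may not match what your real URI contains.

```java
public class StripNonAsciiDemo {

    // Keeps only printable ASCII (space through tilde); drops everything else,
    // including tabs and other control characters.
    static String keepPrintableAscii(String s) {
        return s.replaceAll("[^\\x20-\\x7e]", "");
    }

    // Keeps all of ASCII (0x00-0x7F), control characters included.
    static String keepAscii(String s) {
        return s.replaceAll("[^\\p{ASCII}]", "");
    }

    // Drops only supplementary characters (code points above 0xFFFF).
    static String dropSupplementary(String s) {
        return s.replaceAll("[^\\u0000-\\uFFFF]", "");
    }

    public static void main(String[] args) {
        // Assumption: the bad character is U+FFFD.
        String query = "p_smtbez=Schmalbl\uFFFDttrige";

        System.out.println(keepPrintableAscii(query)); // p_smtbez=Schmalblttrige
        System.out.println(keepAscii(query));          // p_smtbez=Schmalblttrige

        // U+FFFD is inside the BMP, so this call returns the input unchanged.
        System.out.println(dropSupplementary(query).equals(query)); // true

        // A supplementary character (written as a surrogate pair) is removed.
        System.out.println(dropSupplementary("a\uD83D\uDE00b")); // ab
    }
}
```

The two ASCII variants differ only in how they treat control characters: keepPrintableAscii("a\tb") gives "ab", while keepAscii("a\tb") leaves the tab in.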