I have strings such as "A função" and "Ãugent" in which I need to replace characters like ç, ã, and Ã with empty strings.
How can I remove those non-ASCII characters from my string?
I have attempted to implement this using the following function, but it is not working properly. One problem is that the unwanted characters are getting replaced by the space character.
public static String matchAndReplaceNonEnglishChar(String tmpsrcdta) {
    String newsrcdta = null;
    char array[] = Arrays.stringToCharArray(tmpsrcdta);
    if (array == null)
        return newsrcdta;
    for (int i = 0; i < array.length; i++) {
        int nVal = (int) array[i];
        // Is character ISO control
        boolean bISO = Character.isISOControl(array[i]);
        // Is Ignorable identifier
        boolean bIgnorable = Character.isIdentifierIgnorable(array[i]);
        // Remove tab and other unwanted characters..
        if (nVal == 9 || bISO || bIgnorable)
            array[i] = ' ';
        else if (nVal > 255)
            array[i] = ' ';
    }
    newsrcdta = Arrays.charArrayToString(array);
    return newsrcdta;
}
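Two things in the function above work against the stated goal: it overwrites unwanted characters with ' ' instead of dropping them, and its threshold of 255 lets ç (231), ã (227), and Ã (195) through untouched. A minimal sketch of a loop-based alternative follows, assuming that "non-ASCII" means any char above 0x7F. The class and method names are hypothetical, and this version does not reproduce the control-character handling of the original.

```java
final class NonAsciiFilter {

    /** Returns a copy of input with every char above 0x7F dropped; null stays null. */
    static String removeNonAscii(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder kept = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            // Keep only the 7-bit ASCII range; everything else is skipped,
            // so nothing (not even a space) is left in its place.
            if (c <= 0x7F) {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // Unicode escapes avoid depending on the source file encoding.
        System.out.println(removeNonAscii("A fun\u00E7\u00E3o")); // A funo
        System.out.println(removeNonAscii("\u00C3ugent"));        // ugent
    }
}
```

Appending to a StringBuilder rather than editing the char array in place is what allows characters to be removed rather than substituted.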
In Python, to remove non-ASCII characters, call str.encode() with the encoding set to "ascii" and errors set to "ignore", then call decode() on the resulting bytes to get back a string without the non-ASCII characters.
In Excel: Step 1: Click on any cell (D3) and enter the formula =CLEAN(C3). Step 2: Press Enter. CLEAN removes non-printable characters.
replaceAll("\\p{Cntrl}", "?"); replaces every ASCII control character with "?". The following will replace all ASCII non-printable characters, that is, everything outside \p{Print} (shorthand for [\p{Graph}\x20]), including accented characters: my_string.
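To make the difference between the two character classes concrete, here is a small sketch. The snippet above is cut off, so the pattern [^\\p{Print}] used for the second case is an assumption about what was intended, chosen because \p{Print} is the class that [\p{Graph}\x20] abbreviates. The class and method names are hypothetical.

```java
final class PosixClassDemo {

    /** Replaces ASCII control characters (\x00-\x1F and \x7F) with '?'. */
    static String maskControl(String s) {
        return s.replaceAll("\\p{Cntrl}", "?");
    }

    /**
     * Replaces everything outside printable ASCII with '?'.
     * Assumed pattern: by default \p{Print} is ASCII-only in java.util.regex,
     * so accented letters are matched by the negated class as well.
     */
    static String maskNonPrintable(String s) {
        return s.replaceAll("[^\\p{Print}]", "?");
    }

    public static void main(String[] args) {
        System.out.println(maskControl("a\tb"));                      // a?b
        System.out.println(maskControl("A fun\u00E7\u00E3o"));        // unchanged
        System.out.println(maskNonPrintable("A fun\u00E7\u00E3o"));   // A fun??o
    }
}
```

Passing "" instead of "?" as the replacement removes the matched characters rather than masking them.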
This finds every non-ASCII character and replaces it with the empty string:
String resultString = subjectString.replaceAll("[^\\x00-\\x7F]", "");
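For repeated use, the one-liner above can be wrapped in a small helper with a precompiled pattern, sketched below. The class and method names are hypothetical; the regular expression is the same one shown above.

```java
import java.util.regex.Pattern;

final class AsciiOnly {

    // Matches any char outside the 7-bit ASCII range; compiled once and reused.
    private static final Pattern NON_ASCII = Pattern.compile("[^\\x00-\\x7F]");

    /** Returns s with every non-ASCII char removed. */
    static String strip(String s) {
        return NON_ASCII.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("A fun\u00E7\u00E3o")); // A funo
        System.out.println(strip("\u00C3ugent"));        // ugent
    }
}
```

Note that this deletes accented letters outright, so "função" becomes "funo" rather than "funcao". ASCII control characters such as tab are kept, because they fall inside \x00-\x7F.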