I wish to remove all non-printable ASCII characters from a string while retaining invisible ones. I thought this would work because whitespace, \n and \r are invisible characters but not non-printable ones? Basically, I am getting a byte array with � characters in it and I don't want them in it. So I am trying to convert it to a string and remove the � characters before using it as a byte array again.
Space works fine in my code now, but \r and \n do not work. What would be the correct regex to retain these as well? Or is there a better way than what I am doing?
public void write(byte[] bytes, int offset, int count) {
    try {
        String str = new String(bytes, "ASCII");
        String str2 = str.replaceAll("[^\\p{Print}\\t\\n]", "");
        GraphicsTerminalActivity.sendOverSerial(str2.getBytes("ASCII"));
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
    return;
}
EDIT: I tried [^\x00-\x7F] (\x00-\x7F being the range of ASCII characters), but the � symbols still get through, which is weird.
US-ASCII is a character set (and an encoding) with some notable features: values lie between 0 and 127 (0x00 to 0x7F); code point 32 (decimal) represents a SPACE; code point 65 represents the uppercase letter A.
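To make the 0 to 127 range concrete, here is a minimal sketch of what Java does with a byte outside it. The class and method names (AsciiDecodeDemo, decode) are made up for illustration. It relies on the documented behaviour that the String constructor replaces unmappable input with the charset's replacement character, which is U+FFFD (�) when decoding, and that getBytes substitutes '?' when encoding a character ASCII cannot represent.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: AsciiDecodeDemo and decode are hypothetical names.
final class AsciiDecodeDemo {
    // Decodes raw bytes as US-ASCII; bytes above 0x7F have no ASCII mapping
    // and come back as the replacement character U+FFFD.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String s = decode(new byte[] {'A', ' ', (byte) 0x80});
        System.out.println((int) s.charAt(0));          // code point of 'A'
        System.out.println((int) s.charAt(1));          // code point of SPACE
        System.out.println(s.charAt(2) == '\uFFFD');    // was the 0x80 byte replaced?
        // Encoding back to ASCII turns U+FFFD into '?' (0x3F), not the original byte.
        byte[] back = s.getBytes(StandardCharsets.US_ASCII);
        System.out.println(back[2] == '?');
    }
}
```

So a � in the decoded string probably means the byte array held a value above 0x7F, and the original byte cannot be recovered after the round trip.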
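The following regex matches a run of characters that contains none of the control characters NUL, backspace, vertical tab, form feed, or 0x0E to 0x1F. Tab (0x09), \n (0x0A) and \r (0x0D) are not in the class, so they count as acceptable text:
[^\x00\x08\x0B\x0C\x0E-\x1F]*
The following regex finds a single one of those non-printable characters:
[\x00\x08\x0B\x0C\x0E-\x1F]
Note that neither class lists 0x01 to 0x07 or 0x7F (DEL), and neither treats characters above 0x7F, such as �, as non-printable.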
Java code:
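import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// subjectString is the input text to check
boolean foundMatch = false;
try {
    Pattern regex = Pattern.compile("[\\x00\\x08\\x0B\\x0C\\x0E-\\x1F]");
    Matcher regexMatcher = regex.matcher(subjectString);
    foundMatch = regexMatcher.find();
    // Replace the found text with whatever you want, e.g. regexMatcher.replaceAll("")
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}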
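If the data starts and ends as a byte array, one alternative is to skip the String round trip and filter the bytes directly. The following is only a sketch under that assumption: the names AsciiFilter, isWanted and filter are hypothetical, and "wanted" is taken to mean printable ASCII (0x20 to 0x7E) plus tab, \n and \r. Because Java bytes are signed, values above 0x7F are negative and fail the range check, so the bytes that would have shown up as � are dropped too.

```java
import java.io.ByteArrayOutputStream;

// Illustrative only: AsciiFilter, isWanted and filter are hypothetical names.
final class AsciiFilter {
    // Keeps printable ASCII (0x20-0x7E) plus tab, line feed and carriage return.
    static boolean isWanted(byte b) {
        return (b >= 0x20 && b <= 0x7E) || b == '\t' || b == '\n' || b == '\r';
    }

    // Copies the wanted bytes from bytes[offset .. offset+count) into a new array.
    static byte[] filter(byte[] bytes, int offset, int count) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(count);
        for (int i = offset; i < offset + count; i++) {
            if (isWanted(bytes[i])) {
                out.write(bytes[i]);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = {'H', 'i', (byte) 0x80, 0x07, '\r', '\n', (byte) 0xFF, '!'};
        byte[] clean = filter(raw, 0, raw.length);
        // Only 'H', 'i', '\r', '\n' and '!' survive.
        System.out.println(clean.length);
    }
}
```

Unlike the write method in the question, this sketch honours the offset and count arguments, which matters if the buffer is only partly filled.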