Find Control Characters written in bytes in Java

I have a question about control characters. I have to find them in a string and delete them. I did some research and found some useful tips.

I wrote this:

output.toString().replaceAll("[\\p{Cntrl}\\p{Cc}]","")

But I was asked whether this method can find the control characters if they are written in bytes. To be honest, I have no idea. I tried looking on the net, but I don't know how to test it.

Thanks

Tony asked Oct 21 '22 08:10


1 Answer

Yes, the characters will be removed. See the following code:

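// requires: import java.nio.charset.StandardCharsets;
// Bytes 10, 15, 21 and 13 are control characters (LF, SI, NAK, CR).
byte[] chars = { 'h', 'e', 10, 15, 21, 'l', 'l', 'o', 13 };
// The Charset constant avoids the checked UnsupportedEncodingException
// thrown by the String(byte[], String) constructor.
String str = new String(chars, StandardCharsets.UTF_8);
System.out.println("==========");
System.out.println(str);                                        // string as decoded
System.out.println("==========");
System.out.println(str.replaceAll("[\\p{Cntrl}\\p{Cc}]", "")); // control characters removed
System.out.println("==========");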

The output for that code would be:

 ==========
 he
 llo
 ==========
 hello
 ==========

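Once a control character is part of a String object, it doesn't matter whether the String was created from a byte[] or from any other object: it is always stored in the same format. Decoding the bytes produces ordinary char values, and the regex operates on those chars, not on the original bytes.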
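To check this for yourself, here is a minimal sketch that builds the same text twice, once from a byte[] and once from a string literal, and runs the pattern from the question on both. The class name ControlCharDemo and the helper stripControl are made up for illustration; only the regex comes from the question.

```java
import java.nio.charset.StandardCharsets;

public class ControlCharDemo {

    // Hypothetical helper: applies the pattern from the question.
    static String stripControl(String input) {
        return input.replaceAll("[\\p{Cntrl}\\p{Cc}]", "");
    }

    public static void main(String[] args) {
        // Same content, two different origins.
        byte[] raw = { 'h', 'e', 10, 15, 21, 'l', 'l', 'o', 13 };
        String fromBytes = new String(raw, StandardCharsets.UTF_8);
        String fromLiteral = "he\n\u000F\u0015llo\r";

        // After decoding, the two strings are indistinguishable.
        System.out.println(fromBytes.equals(fromLiteral));  // true

        // The regex therefore cleans both in the same way.
        System.out.println(stripControl(fromBytes));         // hello
        System.out.println(stripControl(fromLiteral));       // hello
    }
}
```

This only covers bytes that decode to real control characters. It assumes the bytes are decoded with the charset they were actually written in; a wrong charset can produce different characters before the regex ever runs.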

Roberto answered Oct 23 '22 00:10
