Is there a way to specify the character encoding for java.lang.StringBuilder?

Or am I stuck with:

String s = new String(new byte[0], Charset.forName("ISO-8859-1"));
// or ISO_8859_1, or LATIN-1 or ... still no constants for those
for (String string : strings) { // those are ISO-8859-1 encoded
    s += string; // hopefully this preserves the encoding (?)
}
Mr_and_Mrs_D asked Jul 28 '13 11:07


1 Answer

Strings are always UTF-16-encoded in Java. They're just sequences of char values, which are UTF-16 code units. When you specify the encoding to the String(byte[], String) constructor, it's just saying how to decode the bytes into text - the encoding is discarded afterwards.
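To illustrate the point, here is a minimal sketch (the class name DecodeDemo and its helper methods are made up for this example). It decodes the same text from two different byte encodings and checks that the resulting strings are equal, since neither one remembers which charset its bytes came from.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not from the original question.
class DecodeDemo {
    // Interprets the bytes as ISO-8859-1 and returns the decoded text.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Interprets the bytes as UTF-8 and returns the decoded text.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "caf\u00e9" encoded as ISO-8859-1: one byte per character.
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};
        // The same text encoded as UTF-8: U+00E9 takes two bytes.
        byte[] utf8 = {0x63, 0x61, 0x66, (byte) 0xC3, (byte) 0xA9};

        String a = decodeLatin1(latin1);
        String b = decodeUtf8(utf8);

        System.out.println(a.equals(b)); // true: same chars, no charset attached
        System.out.println(a.length());  // 4 UTF-16 code units in both cases
    }
}
```

The charset argument only affects how the bytes are read. Once the String exists, there is nothing left to "preserve".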

If you need to preserve an encoding, you'll need to create your own class to keep a Charset and String together. I can't say that I've ever wanted to do that though - are you really sure you need to?
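If you do decide you need it, such a wrapper might look roughly like this. This is only a sketch: EncodedText and its methods are hypothetical names invented here, not anything in the JDK.

```java
import java.nio.charset.Charset;
import java.util.Objects;

// Hypothetical helper: pairs decoded text with the charset it was read from.
final class EncodedText {
    private final String text;
    private final Charset charset;

    EncodedText(String text, Charset charset) {
        this.text = Objects.requireNonNull(text, "text");
        this.charset = Objects.requireNonNull(charset, "charset");
    }

    // Decodes the bytes with the given charset and remembers that charset.
    static EncodedText decode(byte[] bytes, Charset charset) {
        return new EncodedText(new String(bytes, charset), charset);
    }

    String text() {
        return text;
    }

    Charset charset() {
        return charset;
    }

    // Re-encodes the text with the remembered charset.
    byte[] toBytes() {
        return text.getBytes(charset);
    }
}
```

For example, `EncodedText.decode(bytes, StandardCharsets.ISO_8859_1).toBytes()` would give back the original bytes for ISO-8859-1 input, because every byte value maps to exactly one character in that charset.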

(So your "stuck with" code wouldn't work anyway - and it would also be inefficient.)
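A more usual approach, sketched here under the assumption that what you actually need is ISO-8859-1 bytes at the end, is to concatenate with a plain StringBuilder and pick the charset only when converting to bytes. JoinDemo and its method names are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not from the original question.
class JoinDemo {
    // Concatenates the parts; no charset is involved at this stage.
    static String join(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    // Encodes the joined text as ISO-8859-1 only at the output boundary.
    static byte[] joinAsLatin1(List<String> parts) {
        return join(parts).getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] out = joinAsLatin1(Arrays.asList("caf", "\u00e9"));
        System.out.println(out.length); // 4: one byte per character in ISO-8859-1
    }
}
```

Appending to a StringBuilder avoids creating a new String on every loop iteration, which is what `s += string` does. Note that `getBytes` replaces any character the target charset cannot represent with that charset's replacement byte, so text outside ISO-8859-1 would not survive the final step.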

Jon Skeet answered Oct 10 '22 16:10