How to force UTF-16 while reading/writing in Java?

I see that you can specify UTF-16 as the charset via Charset.forName("UTF-16"), and that you can create a new UTF-16 decoder via Charset.forName("UTF-16").newDecoder(), but I only see the ability to specify a CharsetDecoder on InputStreamReader's constructor.

So how do you specify UTF-16 when reading from any stream in Java?

IAmYourFaja asked Feb 26 '13 20:02



1 Answer

Input streams deal with raw bytes. When you read directly from an input stream, all you get is raw bytes; character sets play no role at that level.

The interpretation of raw bytes into characters, by definition, requires some sort of translation: how do I translate from raw bytes into a readable string? That "translation" comes in the form of a character set.
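To make the "translation" idea concrete, here is a minimal sketch that decodes the same four bytes under two different character sets. The class name CharsetDecodeDemo and the helper decode are hypothetical names chosen for illustration; only String's (byte[], Charset) constructor and StandardCharsets come from the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
public class CharsetDecodeDemo {

    // Interpret the given raw bytes as text using the given character set.
    public static String decode(byte[] rawBytes, Charset charset) {
        return new String(rawBytes, charset);
    }

    public static void main(String[] args) {
        // Four raw bytes; on their own they carry no notion of "characters".
        byte[] rawBytes = {0x00, 0x41, 0x00, 0x42};

        // As UTF-16 (no byte-order mark, so Java's decoder assumes big-endian):
        // every two bytes form one character, giving "AB".
        System.out.println(decode(rawBytes, StandardCharsets.UTF_16).length());

        // As ISO-8859-1 (one byte per character): four characters.
        System.out.println(decode(rawBytes, StandardCharsets.ISO_8859_1).length());
    }
}
```

The bytes never change; only the character set chosen to interpret them does.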

This "added" layer is implemented by Readers. Therefore, to read characters (rather than bytes) from a stream, you need to construct a Reader of some sort (depending on your needs) on top of the stream. For example:

InputStream is = ...;
Reader reader = new InputStreamReader(is, Charset.forName("UTF-16"));

This will cause reader.read() to read characters using the character set you specified. If you would like to read entire lines, use BufferedReader on top:

BufferedReader reader = new BufferedReader(new InputStreamReader(is, Charset.forName("UTF-16")));
String line = reader.readLine();
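
Since the question also asks about writing, here is a self-contained sketch of a round trip, assuming Java 7 or later (for StandardCharsets and try-with-resources). OutputStreamWriter is the writing counterpart of InputStreamReader and accepts a Charset in the same way. The class name Utf16RoundTrip and its two helper methods are hypothetical, and the in-memory byte array streams stand in for whatever stream you actually have.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
public class Utf16RoundTrip {

    // Encode text as UTF-16 bytes by layering a Writer on top of an OutputStream.
    public static byte[] writeUtf16(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_16)) {
            writer.write(text);
        } catch (IOException e) {
            // Rethrown unchecked only to keep this sketch short for callers.
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Decode the first line of a UTF-16 stream by layering Readers on top of it.
    public static String readFirstLineUtf16(InputStream in) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_16))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = writeUtf16("hello\nworld\n");
        // Java's UTF-16 encoder writes a big-endian byte-order mark first,
        // so the array holds 2 BOM bytes plus 2 bytes per character.
        System.out.println(bytes.length);
        System.out.println(readFirstLineUtf16(new ByteArrayInputStream(bytes)));
    }
}
```

If you need a specific byte order without a byte-order mark, StandardCharsets.UTF_16BE or StandardCharsets.UTF_16LE can be passed instead.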
Isaac answered Sep 21 '22 14:09
