
ObjectOutputStream methods: writeBytes(String str) vs writeUTF(String s);

What's the main difference between the two?

After all, both of them write Strings.

public void writeUTF(String str)
              throws IOException

Primitive data write of this String in modified UTF-8 format.

vs

public void writeBytes(String str)
                throws IOException

Writes a String as a sequence of bytes.

When should I use one rather than the other?

Rollerball asked Aug 01 '13 14:08


3 Answers

It's in the documentation... from DataOutput.writeBytes(String):

Writes a string to the output stream. For every character in the string s, taken in order, one byte is written to the output stream. If s is null, a NullPointerException is thrown.

If s.length is zero, then no bytes are written. Otherwise, the character s[0] is written first, then s[1], and so on; the last character written is s[s.length-1]. For each character, one byte is written, the low-order byte, in exactly the manner of the writeByte method. The high-order eight bits of each character in the string are ignored.

In other words, "Sod Unicode, we don't care about any characters not in ISO-8859-1. Oh, and we assume you don't care about the length of the string either."

Note that writeBytes doesn't even try to detect data corruption - if you write out a character which isn't in ISO-8859-1, it will just drop the high byte silently.
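To make the silent truncation concrete, here is a minimal sketch. It uses a plain DataOutputStream rather than an ObjectOutputStream so that the raw bytes are visible without the object stream's header and block framing; both implement the same DataOutput contract quoted above. The class and method names (WriteBytesDemo, rawWriteBytes) are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class WriteBytesDemo {
    // Hypothetical helper: returns the raw bytes that writeBytes emits for s.
    static byte[] rawWriteBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeBytes(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        // 'A' is U+0041; the euro sign is U+20AC, which is not in ISO-8859-1.
        byte[] bytes = rawWriteBytes("A\u20AC");
        for (byte b : bytes) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
        // Two bytes come out: 41 and AC. The 0x20 high byte of U+20AC is gone,
        // and there is no length prefix telling a reader where the string ends.
    }
}
```

A reader decoding those two bytes as ISO-8859-1 would get "A" followed by U+00AC (the not sign), with no error raised anywhere.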

Just say no - writeUTF is your friend... assuming the encoded form of your string is less than 64K bytes in length (writeUTF throws UTFDataFormatException above 65535 bytes).
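For comparison, a rough sketch of what writeUTF does with the same kind of input: it writes a two-byte length followed by the modified UTF-8 bytes, so readUTF can restore the string exactly, and it refuses strings whose encoded form exceeds 65535 bytes. Again this uses DataOutputStream/DataInputStream to keep the bytes easy to inspect, and the helper names (rawWriteUtf, readBack, fitsInWriteUtf) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class WriteUtfDemo {
    // Hypothetical helper: returns the raw bytes that writeUTF emits for s.
    static byte[] rawWriteUtf(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Hypothetical helper: decodes bytes previously produced by writeUTF.
    static String readBack(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: false if writeUTF rejects s as too long.
    static boolean fitsInWriteUtf(String s) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(s);
            return true;
        } catch (UTFDataFormatException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = rawWriteUtf("A\u20AC");
        // 2 length bytes (00 04) + 1 byte for 'A' + 3 bytes for the euro sign.
        System.out.println(bytes.length);
        System.out.println(readBack(bytes).equals("A\u20AC"));

        char[] big = new char[65536];
        Arrays.fill(big, 'a');
        // 65536 ASCII characters encode to 65536 bytes, one over the limit.
        System.out.println(fitsInWriteUtf(new String(big)));
    }
}
```

The limit is on encoded bytes, not characters, so a string of non-ASCII characters hits it at well under 65535 characters.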

Of course, if you have a protocol you're trying to implement which itself requires a single-byte encoding (ISO-8859-1 or ASCII) and doesn't use a length, then writeBytes might be appropriate - but I'd personally probably perform the text-to-bytes conversion myself and then use write(byte[]) instead... it's clearer.
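As a sketch of that "convert it yourself" approach, assuming a protocol that wants ISO-8859-1: encode with an explicit charset, then hand the bytes to write(byte[]). Note that String.getBytes(StandardCharsets.ISO_8859_1) would substitute '?' for unmappable characters, so this version uses a CharsetEncoder, which reports them by default. The class and helper names (ExplicitEncodingDemo, encodeLatin1Strict, writeLabel) are invented for the example, and wrapping the failure in IllegalArgumentException is just one possible choice.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

class ExplicitEncodingDemo {
    // Hypothetical helper: encodes s as ISO-8859-1 and fails loudly
    // if any character cannot be represented, instead of corrupting it.
    static byte[] encodeLatin1Strict(String s) {
        try {
            // A fresh encoder reports unmappable characters by throwing.
            ByteBuffer encoded = StandardCharsets.ISO_8859_1.newEncoder().encode(CharBuffer.wrap(s));
            byte[] bytes = new byte[encoded.remaining()];
            encoded.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("not representable in ISO-8859-1: " + s, e);
        }
    }

    // Hypothetical helper: writes a label the way a single-byte protocol would expect it.
    static byte[] writeLabel(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.write(encodeLatin1Strict(s)); // the encoding is explicit at the call site
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is in ISO-8859-1, so this writes 4 bytes.
        System.out.println(writeLabel("caf\u00E9").length);
        try {
            writeLabel("\u20AC"); // the euro sign is not in ISO-8859-1
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The same out.write(byte[]) call is available on ObjectOutputStream, so the pattern carries over unchanged.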

Jon Skeet answered Nov 18 '22 02:11


If there's a possibility that your String holds wide characters (basically anything beyond standard ASCII), use writeUTF. If your output is going to something that requires a one-byte-per-character encoding, such as header labels in many network protocols, use writeBytes.

chrylis -cautiouslyoptimistic- answered Nov 18 '22 02:11


When data is written with writeUTF, it is encoded in a Unicode (Universal Character Set) format, so if your string contains anything other than ASCII characters, use writeUTF; otherwise writeBytes is OK.

Vijay answered Nov 18 '22 02:11