
java socket writeUTF() and readUTF()

Tags: java, sockets

I've been reading some Java socket code snippets and found that, to send messages in sequence over a socket, you don't have to separate them by hand; the writer/reader streams do it for you automatically. Here is an example:

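writer.java
DataOutputStream out = new DataOutputStream(socket.getOutputStream()); // socket: a connected java.net.Socket
out.writeUTF("Hello");
out.writeUTF("World");


reader.java
DataInputStream in = new DataInputStream(socket.getInputStream()); // socket: the peer's connected java.net.Socket
String a = in.readUTF(); // a=Hello
String b = in.readUTF(); // b=World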

I've tried this code snippet and it works fine. However, I'm wondering whether this coding style is actually guaranteed to work. Are there any potential risks in reading and writing the socket stream in sequence without explicitly separating each segment?

Longbiao CHEN asked Oct 24 '10 16:10


2 Answers

writeUTF() writes the length of the String (in bytes, as encoded in modified UTF-8) as a two-byte prefix, followed by the encoded data; readUTF() reads that length first and then exactly that many bytes. So there are some potential problems:

  • The maximum length of Strings that can be handled this way is 65535 bytes once encoded: 65535 characters for pure ASCII, fewer if you use non-ASCII characters. You cannot easily predict the limit in that case, other than by conservatively assuming 3 bytes per character (65535 / 3 = 21845). So if you're sure you'll never send Strings longer than about 20k characters, you'll be fine; a longer String makes writeUTF() throw a UTFDataFormatException.
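  • If the app ever needs to communicate with something else (that's not written in Java), the other side may have a hard time handling the modified UTF-8, which encodes the null character in two bytes and each supplementary character as a pair of 3-byte surrogates. For application-internal communication, you don't have to worry about this.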
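
To make both points concrete, here is a minimal sketch that encodes a few Strings into an in-memory buffer and inspects the result. The class WriteUtfLimits and its helpers encode, fits and repeat are hypothetical names made up for this illustration; only DataOutputStream.writeUTF() is the API being discussed, and the expected values in the comments follow from the 2-byte prefix and the 65535-byte limit described above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; only DataOutputStream.writeUTF is the real API under discussion.
class WriteUtfLimits {

    // Returns the exact bytes writeUTF produces: 2-byte length prefix + modified UTF-8 data.
    static byte[] encode(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // True if writeUTF accepts s, false if it rejects it as too long.
    static boolean fits(String s) {
        try {
            encode(s);
            return true;
        } catch (UncheckedIOException e) {
            if (e.getCause() instanceof UTFDataFormatException) {
                return false;
            }
            throw e;
        }
    }

    // Builds a String consisting of n copies of c.
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE00"; // one supplementary character, two Java chars
        System.out.println(encode("Hello").length);                        // 7 (2 + 5)
        System.out.println(encode(emoji).length - 2);                      // 6 in modified UTF-8
        System.out.println(emoji.getBytes(StandardCharsets.UTF_8).length); // 4 in standard UTF-8
        System.out.println(fits(repeat('a', 65535)));                      // true
        System.out.println(fits(repeat('a', 65536)));                      // false
        System.out.println(fits(repeat('\u20AC', 21845)));                 // true  (65535 bytes)
        System.out.println(fits(repeat('\u20AC', 21846)));                 // false (65538 bytes)
    }
}
```

The emoji lines show the interoperability issue: a non-Java peer decoding the payload as standard UTF-8 would expect 4 bytes there, not 6.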
Michael Borgwardt answered Sep 18 '22 11:09


According to the documentation, the readUTF and writeUTF methods work with a modified version of UTF-8 and also add the length of the string (in bytes) at the beginning of the data.

This means the read operation will wait until enough bytes have been fetched before returning the string. So the messages are in fact segmented, even if you don't see it, since you merely decorate the socket's streams with DataInputStream and DataOutputStream.

In conclusion, yes, it should be quite safe, since the API itself will take care of separating the single messages.
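
As a rough illustration of that behaviour, the sketch below writes two messages back to back and reads them through a stream that delivers only one byte per bulk read, which is an assumed stand-in for a socket handing data over in small fragments. The class UtfFramingDemo, the OneByteAtATime wrapper and the helper methods are hypothetical names for this example; an in-memory buffer replaces the real socket so the code runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; an in-memory buffer stands in for the socket.
class UtfFramingDemo {

    // Simulates a fragmented network stream: at most one byte per bulk read call.
    static final class OneByteAtATime extends FilterInputStream {
        OneByteAtATime(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return super.read(b, off, Math.min(len, 1));
        }
    }

    // Writes all messages back to back and returns the raw bytes "on the wire".
    static byte[] send(String... messages) {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(wire)) {
            for (String m : messages) {
                out.writeUTF(m);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return wire.toByteArray();
    }

    // Reads exactly count messages from the fragmented stream.
    static List<String> receive(byte[] wire, int count) {
        List<String> received = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(
                new OneByteAtATime(new ByteArrayInputStream(wire)))) {
            for (int i = 0; i < count; i++) {
                received.add(in.readUTF());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return received;
    }

    // True if reading a message whose last byte is missing fails with EOFException.
    static boolean truncatedReadFails(byte[] wire) {
        byte[] cut = Arrays.copyOf(wire, wire.length - 1);
        try {
            receive(cut, 1);
            return false;
        } catch (UncheckedIOException e) {
            return e.getCause() instanceof EOFException;
        }
    }

    public static void main(String[] args) {
        byte[] wire = send("Hello", "World");
        System.out.println(wire.length);                       // 14 (two 2-byte prefixes + 10 bytes)
        System.out.println(receive(wire, 2));                  // [Hello, World]
        System.out.println(truncatedReadFails(send("Hello"))); // true
    }
}
```

The last check shows the one case worth handling: if the stream ends in the middle of a message, readUTF() does not return a partial string but throws an EOFException.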

Jack answered Sep 19 '22 11:09