Java: String.getBytes(Charset) Vs. Charset.encode(String) for use with OutputStream

Question

My algorithm takes two inputs:

One UTF-8 String object to be encoded
One Charset object that indicates what I need to encode the string into

The encoded result is then written to an OutputStream. This happens at least once and possibly several times. There is no multithreading in this scenario.

I have found two solutions:

Calling getBytes(Charset) on the given String, passing the given Charset. This returns a byte[].
Calling encode(String) on the given Charset, passing the given String. This returns a ByteBuffer.
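
For reference, the two options can be sketched side by side as follows. This is a minimal illustration, not a benchmark; the class and method names (EncodeOptions, writeViaGetBytes, writeViaEncode) are hypothetical and chosen only for this example. Both calls replace malformed or unmappable characters with the charset's default replacement rather than throwing.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

class EncodeOptions {
    // Option 1: String.getBytes(Charset) returns a byte[] holding exactly the encoded bytes.
    static void writeViaGetBytes(String text, Charset cs, OutputStream out) throws IOException {
        byte[] bytes = text.getBytes(cs);
        out.write(bytes);
    }

    // Option 2: Charset.encode(String) returns a ByteBuffer. Only the bytes between
    // position and limit are output; a backing array may be larger than that.
    static void writeViaEncode(String text, Charset cs, OutputStream out) throws IOException {
        ByteBuffer buf = cs.encode(text);
        if (buf.hasArray()) {
            // Write straight from the backing array, respecting offset and length.
            out.write(buf.array(), buf.arrayOffset() + buf.position(), buf.remaining());
        } else {
            // Fallback for buffers without an accessible array: copy the valid bytes out.
            byte[] copy = new byte[buf.remaining()];
            buf.get(copy);
            out.write(copy);
        }
    }
}
```

Note that with the ByteBuffer option, calling buf.array() and writing the whole array would be a bug whenever the backing array is larger than the encoded content, which is why the offset and remaining() are used here.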

Delving into the code behind these methods shows a complex design for each underlying algorithm. I can't say I understand how to make a choice between these two options.

Is there a significant performance difference for calling either method?
Is there a significant performance difference for putting the result into the OutputStream?
Is there a significant difference in footprint?

Which solution would generally be a better choice?

Peter Lawrey · Accepted Answer

In both cases, a byte[] is built dynamically to hold the encoded string. A more efficient approach is to have the text encoded and written directly to the OutputStream, for example:

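// Assumes out is an open OutputStream and text is the String to encode.
OutputStreamWriter osw = new OutputStreamWriter(out, StandardCharsets.UTF_8);
// look Mum, no byte[] needed
osw.write(text);
osw.flush(); // the writer buffers internally; flush before using out directly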
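
The snippet above hard-codes UTF-8, whereas the question supplies its own Charset and may write more than once. A minimal sketch of that case might look like the following; DirectWrite and writeRepeated are hypothetical names, and the times parameter is an assumption standing in for "possibly several times".

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

class DirectWrite {
    // Hypothetical helper: encodes text with the given charset and writes it
    // to out the requested number of times through a single writer.
    static void writeRepeated(String text, Charset cs, OutputStream out, int times) throws IOException {
        Writer w = new OutputStreamWriter(out, cs);
        for (int i = 0; i < times; i++) {
            w.write(text);
        }
        // Flush rather than close, so the caller keeps ownership of out.
        w.flush();
    }
}
```

Reusing one writer for all the writes avoids creating a fresh encoder per call. The writer is flushed but not closed here because closing it would also close the underlying stream.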

An alternative would be to use DataOutputStream.writeUTF if you need a binary format.
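
As a rough illustration of the writeUTF route: it writes a two-byte length prefix followed by the string in Java's modified UTF-8, so it does not honour an arbitrary Charset and cannot hold strings whose encoded form exceeds 65535 bytes. The sketch below round-trips a string through DataOutputStream and DataInputStream; WriteUtfDemo and its methods are hypothetical names used only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class WriteUtfDemo {
    // Encodes text as a 2-byte length prefix plus modified UTF-8 bytes.
    static byte[] encode(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(sink);
        dos.writeUTF(text);
        dos.flush();
        return sink.toByteArray();
    }

    // Reads back a string written by writeUTF.
    static String decode(byte[] data) throws IOException {
        return new DataInputStream(new ByteArrayInputStream(data)).readUTF();
    }
}
```

This format is only suitable when the reader is also Java (or otherwise understands the length prefix and modified UTF-8); for plain text in a caller-chosen encoding, the OutputStreamWriter approach is the closer fit.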

Tags: java, string, character-encoding, encoding, bytebuffer

dammkewl
