
Java String.getBytes( charsetName ) vs String.getBytes ( Charset object )


I need to encode a String to a byte array using UTF-8 encoding. I am using Google Guava, whose Charsets class already defines a Charset instance for UTF-8. There are two ways to do it:

  1. String.getBytes( charsetName )

    try {
        byte[] bytes = my_input.getBytes("UTF-8");
    } catch (UnsupportedEncodingException ex) {
    }

  2. String.getBytes( Charset object )

    // Charsets.UTF_8 is an instance of Charset
    byte[] bytes = my_input.getBytes(Charsets.UTF_8);

My question is: which one should I use? They return the same result. With way 2, I don't need a try/catch. I took a look at the Java source code and saw that way 1 and way 2 are implemented differently.

Does anyone have any ideas?

Loc asked Apr 26 '14 21:04



1 Answer

If you are going to use a string literal (e.g. "UTF-8"), you shouldn't. Instead, use the second version and supply the constant from StandardCharsets (in this case, StandardCharsets.UTF_8), which has been part of java.nio.charset since Java 7.
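As a minimal sketch of that recommendation, the snippet below encodes with the JDK constant rather than a string literal. The class and method names (Utf8Encoding, encodeUtf8) are made up for illustration; only String.getBytes(Charset) and StandardCharsets.UTF_8 come from the JDK.

```java
import java.nio.charset.StandardCharsets;

class Utf8Encoding {
    // Encodes text as UTF-8 via the Charset overload.
    // No checked exception is declared, so no try/catch is needed.
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u00e9" is 'é', which takes two bytes in UTF-8,
        // so these 5 characters become 6 bytes.
        byte[] bytes = encodeUtf8("h\u00e9llo");
        System.out.println(bytes.length);
    }
}
```

Guava's Charsets.UTF_8 from the question is also a Charset instance, so it can be passed to the same overload; the JDK constant just avoids the extra dependency.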

The first version is for when the charset is dynamic, that is, when you don't know the charset at compile time because it is supplied by an end user, read from a config file or system property, etc.
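For the dynamic case, something like the following could work. It is only a sketch: the class name, the encode helper, and the "app.charset" system property are all hypothetical, and rethrowing as IllegalArgumentException is one possible choice rather than an empty catch block.

```java
import java.io.UnsupportedEncodingException;

class DynamicCharsetEncoding {
    // charsetName is only known at run time (config file, user input, ...).
    static byte[] encode(String text, String charsetName) {
        try {
            return text.getBytes(charsetName);
        } catch (UnsupportedEncodingException ex) {
            // Surface the bad name to the caller instead of swallowing it.
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, ex);
        }
    }

    public static void main(String[] args) {
        // "app.charset" is a made-up property name used only in this sketch;
        // it falls back to UTF-8 when the property is not set.
        String name = System.getProperty("app.charset", "UTF-8");
        System.out.println(encode("hello", name).length);
    }
}
```

Here the try/catch earns its place, because the name really can be wrong at run time.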

Internally, both methods call a version of StringCoding.encode(). The version used by the first method simply looks up the Charset by the supplied name first and throws an exception if that charset is unknown or unavailable.
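The lookup-then-encode sequence described above can be approximated with public API. This is not the JDK's internal code, just an illustration of the same two steps; CharsetLookup, lookup, and encode are hypothetical names, while Charset.forName and the two exception types are real.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Optional;

class CharsetLookup {
    // Step 1: resolve the name. Empty when the name is malformed
    // or the charset is not available in this JVM.
    static Optional<Charset> lookup(String name) {
        try {
            return Optional.of(Charset.forName(name));
        } catch (IllegalCharsetNameException | UnsupportedCharsetException ex) {
            return Optional.empty();
        }
    }

    // Step 2: encode with the Charset overload once the lookup has succeeded.
    static byte[] encode(String text, String name) {
        Charset charset = lookup(name)
                .orElseThrow(() -> new IllegalArgumentException("Unknown charset: " + name));
        return text.getBytes(charset);
    }
}
```

Resolving the name once and reusing the resulting Charset also keeps the failure handling in a single place.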

Brian Roach answered Oct 21 '22 14:10
